I have heavily optimized the SVM meta-parameters on the Iris classification dataset. For this I ran 1000 rounds of random cross-validation (repeated random train/test splits) and took the best parameters to run the algorithm again. After that I observed that the accuracy on the test set was higher (around 1.3% error) than on the training set (around 3.3% error). The difference seems substantial, and I do not see such results on any other dataset I have tried. How can this be explained? Granted, the parameters were optimized using the accuracy on random test splits, but how can a model do better on the test data than on the training data?
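
To make the setup concrete, here is a minimal sketch of the kind of procedure I mean, assuming scikit-learn's SVC, load_iris, and train_test_split; the parameter grid, split ratio, and random seeds are placeholders, not my actual settings:

```python
# Sketch of the procedure described above (placeholder grid and split ratio).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate meta-parameters (placeholder values).
param_grid = [(C, gamma) for C in (0.1, 1, 10, 100) for gamma in (0.01, 0.1, 1)]
scores = {p: [] for p in param_grid}

rng = np.random.RandomState(0)
for _ in range(1000):  # 1000 random train/test splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=rng.randint(2**31 - 1))
    for C, gamma in param_grid:
        clf = SVC(C=C, gamma=gamma).fit(X_tr, y_tr)
        scores[(C, gamma)].append(clf.score(X_te, y_te))

# Keep the parameters with the best average test accuracy over the splits ...
best_C, best_gamma = max(scores, key=lambda p: np.mean(scores[p]))

# ... then refit on a fresh split and compare training vs. test error.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)
clf = SVC(C=best_C, gamma=best_gamma).fit(X_tr, y_tr)
print("train error: %.3f" % (1 - clf.score(X_tr, y_tr)))
print("test  error: %.3f" % (1 - clf.score(X_te, y_te)))
```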