Being pragmatic, we usually solve the overfitting problem by specifying an additional criterion (a regularization term or prior) that is traded off against fitting the training data. Often a fairly crude regularization (ridge or L1, i.e. a Gaussian or Laplacian prior) does a rather good job. And if we happen to have enough data, the problem becomes even less critical.
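To make that concrete, here is a minimal scikit-learn sketch of the ridge (L2) vs. L1 case; the alpha values are arbitrary illustration choices, not recommendations:

```python
# Hedged sketch: ridge (L2 ~ Gaussian prior) vs. lasso (L1 ~ Laplacian prior).
# The alpha values are arbitrary illustration values.
from sklearn.linear_model import Ridge, Lasso
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=50, n_features=100, noise=10.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks all weights towards zero
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty drives many weights exactly to zero

print("non-zero ridge weights:", (ridge.coef_ != 0).sum())
print("non-zero lasso weights:", (lasso.coef_ != 0).sum())
```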
Another way would be to define the regularization/prior using hyperparameters and learn these too. This may be more robust against parameter misspecification, but it effectively only shifts the problem to a higher level.
If this is not enough, we can validate our learning procedure using techniques like cross-validation, which is a way to adjust the regularization/prior. But this may be computationally expensive.
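A rough sketch of tuning the regularization strength by cross-validation (the alpha grid and the 5-fold setting are just common defaults, not part of the original answer):

```python
# Hedged sketch: choosing the ridge regularization strength by cross-validation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=50, noise=5.0, random_state=0)

# Try a range of alphas and keep the one with the best average validation score.
search = GridSearchCV(Ridge(), param_grid={"alpha": np.logspace(-3, 3, 13)}, cv=5)
search.fit(X, y)
print("best alpha:", search.best_params_["alpha"])
```

This is also where the computational cost mentioned above comes from: every candidate value of the hyperparameter means refitting the model several times.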
In general there is no "solution" to the problem, since we simply cannot know the correct prior distribution over models. So we are forced to choose a prior (whether flat, Gaussian or whatever) based on intuition. And if people say they don't make any assumptions about the prior, they are simply not aware of their implicit assumptions... ;-)
I agree with Matthias. Learning the parameters is not trivial, but it helps a lot in handling overfitting. Furthermore, it also improves the results achieved with a system.
Of course, introducing momentum or, alternatively, some kind of annealing may help as well.
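A minimal sketch of what that can look like in a plain gradient-descent loop; the learning rate, momentum coefficient, and decay schedule below are arbitrary illustration values:

```python
# Hedged sketch: gradient descent with momentum and a simple learning-rate
# annealing schedule, on an ordinary least-squares objective.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=100)

w = np.zeros(5)
velocity = np.zeros(5)
lr0, momentum = 0.1, 0.9

for t in range(200):
    lr = lr0 / (1.0 + 0.01 * t)          # annealing: decay the step size over time
    grad = X.T @ (X @ w - y) / len(y)    # gradient of the mean squared error
    velocity = momentum * velocity - lr * grad
    w = w + velocity                     # momentum smooths successive updates

print("error in recovered weights:", np.linalg.norm(w - w_true))
```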
In general, Robert is right: there is no simple solution to this issue.
As several authors have already indicated, there is no simple solution. Here is a bit of a more detailed explanation of the hows and whys:
Overfitting occurs when a model starts describing noise instead of the underlying unknown function we are trying to approximate. The main reasons are:
1) Limited data with respect to the complexity of the model. The VC dimension is a very good measure that describes how complex a model is in terms of active degrees of freedom (related to how many data points the algorithm can shatter), and based on it there are equations that give you a pessimistic estimate of how many observations you need in order to be "probably approximately correct" (a rough form of the bound is sketched after this list). This is explained in several textbooks, and an intuitive introduction is also available, for example, in the lessons here:
http://work.caltech.edu/telecourse
2) The data are especially noisy and outliers are present.
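For reference, one common form of the VC generalization bound (as presented, for instance, in the Learning From Data course linked above; the exact constants vary between textbooks) says that with probability at least 1 - δ:

```latex
E_{\text{out}}(g) \;\le\; E_{\text{in}}(g) \;+\; \sqrt{\frac{8}{N}\,\ln\!\frac{4\big((2N)^{d_{\mathrm{VC}}}+1\big)}{\delta}}
```

The larger the VC dimension relative to the number of samples N, the looser the guarantee, which is exactly the "limited data with respect to model complexity" situation described in point 1).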
Regularization methods, which are one of the general-purpose tools for reducing overfitting, usually take the form of a penalty on complexity, either as a restriction towards smoothness or (as indicated in other answers) as bounds on the vector space norm. The idea is to pay a small penalty on how well we do in the training sample in order to have a significantly better chance to generalize successfully "out of sample" (i.e., on unknown data).
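In symbols, a typical penalized formulation looks like the following, where λ controls the trade-off between training fit and complexity, and the squared norm is just one common choice of penalty (the ridge case mentioned in earlier answers):

```latex
\min_{w}\;\; \frac{1}{N}\sum_{i=1}^{N} \ell\big(f_w(x_i),\, y_i\big) \;+\; \lambda\,\|w\|^{2}
```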
Beyond regularization, other things can be helpful, especially for the second part of the problem (outliers). For example, the effect of methods such as bagging in reducing the impact of outliers (compared to a simple max-margin approach) is explained very intuitively here:
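Independently of that reference, here is a minimal scikit-learn sketch of the bagging idea; the base learner and all hyperparameters are arbitrary illustration choices:

```python
# Hedged sketch: bagging a decision tree. Averaging over bootstrap resamples
# tends to dampen the influence of individual noisy points or outliers.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, flip_y=0.1, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
bagged_trees = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)

print("single tree CV accuracy :", cross_val_score(single_tree, X, y, cv=5).mean())
print("bagged trees CV accuracy:", cross_val_score(bagged_trees, X, y, cv=5).mean())
```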
1. use regularization (for optimization-based classifiers). Here, you should take some time to set an appropriate value for the regularization parameter;
2. increase the number of samples in the training set;
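For point 2, a learning curve makes the effect of more data visible: the gap between training and validation score usually shrinks as the training set grows. A rough sketch, with a placeholder dataset and model:

```python
# Hedged sketch: learning curve showing how the train/validation gap
# (a symptom of overfitting) typically narrows with more training data.
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.1, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(max_depth=5, random_state=0), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.2f}  validation={va:.2f}")
```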
As other researchers have noted, you can avoid overfitting by applying a feature selection algorithm or ensemble learning. I would suggest going for ensemble learning methods, either stacking or voting. Both techniques have their pros and cons.
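A minimal sketch of both ensemble strategies with scikit-learn; the base learners and their settings are arbitrary illustration choices, not a recommendation:

```python
# Hedged sketch: voting and stacking ensembles over a few simple base learners.
from sklearn.ensemble import VotingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
base = [("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=5)),
        ("svm", SVC(probability=True))]

# Voting averages the base learners' predictions; stacking trains a meta-model on them.
voting = VotingClassifier(estimators=base, voting="soft")
stacking = StackingClassifier(estimators=base, final_estimator=LogisticRegression())

print("voting CV accuracy  :", cross_val_score(voting, X, y, cv=5).mean())
print("stacking CV accuracy:", cross_val_score(stacking, X, y, cv=5).mean())
```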