A model may be underfitting or overfitting the data. You therefore always need to validate the stability of your machine learning model, to get an indication of how well it will generalize to an unseen data set.

On the other hand, setting data aside for validation reduces the training data, so we risk losing important trends in the data set, which in turn increases the error induced by bias.

In k-fold cross-validation, the data is divided into k subsets. Each time, one of the k subsets is used as the validation set and the other k-1 subsets are put together to form the training set. The error estimate is averaged over all k trials to estimate the overall effectiveness of the model. As can be seen, every data point is in a validation set exactly once and in a training set k-1 times. Interchanging the training and validation sets in this way also adds to the effectiveness of the method.
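
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names `k_fold_indices` and `cross_val_mse`, the toy data `y`, and the use of a trivial "predict the training mean" model are all assumptions made for the example. In practice you would plug in your own model and error measure.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        # The other k-1 folds are put together to form the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx

def cross_val_mse(y, k):
    """Average validation MSE over k folds for a mean predictor.

    The "model" here just predicts the mean of the training targets;
    it stands in for whatever learner you are validating.
    """
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        prediction = sum(y[j] for j in train_idx) / len(train_idx)
        mse = sum((y[j] - prediction) ** 2 for j in val_idx) / len(val_idx)
        fold_errors.append(mse)
    # The error estimate is averaged over all k trials.
    return sum(fold_errors) / k

# Toy targets, assumed purely for the example.
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
avg_error = cross_val_mse(y, k=5)
print("average validation MSE over 5 folds:", avg_error)
```

With 10 samples and k = 5, each fold holds 2 points, so every point is validated once and used for training in the other 4 rounds.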
