Hi there!

I have a sample of 2,500 data points, each with 9 attributes. I split the set 75% / 25% for training and testing (the test rows were selected at random). In the SGB model I used a learning rate (shrinkage factor) of 0.05 and a sub-sample fraction of 0.5 for bagging. Each tree has 15 terminal nodes, and all feature (attribute) interactions are allowed. Growing 20,000 trees sequentially (i.e., iterating 20,000 times), I get an R2 (R-square) of 99.7 on the training data and 98.8 on the testing data. With 10-fold cross-validation I get an R2 of 97.6.
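
For reference, here is a minimal sketch of the setup I describe above, assuming scikit-learn's GradientBoostingRegressor (the software I actually used is not important here; `X` and `y` below are just placeholders standing in for my 2,500 x 9 attribute matrix and the response):

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import GradientBoostingRegressor

# Placeholders standing in for the real 2,500 x 9 data set and response.
X = np.random.rand(2500, 9)
y = np.random.rand(2500)

# 75% training / 25% testing, with the test rows chosen at random.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = GradientBoostingRegressor(
    n_estimators=20000,   # 20,000 sequential trees
    learning_rate=0.05,   # shrinkage factor
    subsample=0.5,        # bagging (sub-sample) fraction
    max_leaf_nodes=15,    # 15 terminal nodes per tree
    max_depth=14,         # deep enough that only the 15-leaf limit binds
    random_state=42)
model.fit(X_train, y_train)

print("Training R2:", model.score(X_train, y_train))
print("Testing  R2:", model.score(X_test, y_test))
print("10-fold CV R2:", cross_val_score(model, X, y, cv=10).mean())
```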

Since I used a very low learning rate together with bagging (sub-sampling), I assume the accuracy I am getting is not due to over-fitting; the MSE vs. number-of-trees curve also decreases gradually without any spikes (see the sketch below for how I check it).
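
This is a sketch (under the same scikit-learn assumption as above) of how the MSE vs. number-of-trees curve can be traced with staged predictions, to see whether the test error ever turns back upward, which would be the usual sign of over-fitting:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

# Test-set MSE after each additional tree in the ensemble.
test_mse = [mean_squared_error(y_test, y_pred)
            for y_pred in model.staged_predict(X_test)]

plt.plot(range(1, len(test_mse) + 1), test_mse)
plt.xlabel("Number of trees")
plt.ylabel("Test MSE")
plt.show()
```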

But since I am iterating 20,000 times and still getting this level of accuracy, I am a little confused about the over-fitting concept. Please let me know whether my approach and understanding are correct.

Thank you. 
