I have a problem with my cross-validation R-squared values.

Dear All,

I am writing to seek some assistance with computing cross-validated R-squared values in RStudio.

I have attached the R script as well as the workspace that illustrate my problem.

In brief, I fitted a model using lm in RStudio to obtain an R-squared value, which I compared with the value from the program we have been using in our research group, MOE (screenshot attached as "R-squared-From-MOE.png").

The R-squared values were similar. I also observed similar cross-validated R-squared values when I used leave-one-out cross-validation (LOOCV) in R and in MOE (confirmed in the attached picture).
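To make the comparison concrete, here is a simplified sketch of that first step (this is not my exact script, and "my_data" and "activity" are placeholder names; the attached files contain the real models and data):

library(caret)

## Ordinary least-squares fit; 'my_data' stands in for the data frame in the attached workspace.
fit_lm <- lm(activity ~ ., data = my_data)
summary(fit_lm)$r.squared                 # lm R-squared (about 0.36 for my data)

## Leave-one-out cross-validation of the same linear model.
loo_ctrl <- trainControl(method = "LOOCV")
fit_loo <- train(activity ~ ., data = my_data, method = "lm", trControl = loo_ctrl)
fit_loo$results$Rsquared                  # LOOCV R-squared (about 0.23 for my data)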

However, when I went on to leave several observations out at a time (in this case 3), I obtained cross-validated R-squared values that I cannot explain despite searching. They are much higher (mostly around 0.7) than the lm R-squared and the LOOCV R-squared, which are about 0.36 and 0.23 respectively.

I tried cv, repeatedcv, and rf, and in every case I got very high R-squared values, again in the neighbourhood of 0.7.
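To illustrate the kind of setup I mean, here is a simplified sketch using the caret package (again with placeholder names, and the repeats value chosen arbitrarily; the attached script has the exact calls):

library(caret)

## k-fold cross-validation with folds of roughly 3 observations each, i.e. "leave three out" at a time.
cv_ctrl <- trainControl(method = "cv", number = nrow(my_data) %/% 3)
fit_cv <- train(activity ~ ., data = my_data, method = "lm", trControl = cv_ctrl)
fit_cv$results$Rsquared                   # comes out around 0.7

## Repeated k-fold cross-validation with the same fold size.
rcv_ctrl <- trainControl(method = "repeatedcv", number = nrow(my_data) %/% 3, repeats = 10)
fit_rcv <- train(activity ~ ., data = my_data, method = "lm", trControl = rcv_ctrl)
fit_rcv$results$Rsquared                  # also around 0.7

## Random forest with the same resampling scheme.
fit_rf <- train(activity ~ ., data = my_data, method = "rf", trControl = cv_ctrl)
fit_rf$results$Rsquared                   # one value per mtry setting, all around 0.7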

My concern is the following:

How can one explain such a result, when it is commonly said that cross-validated R-squared values should normally be smaller than the lm R-squared (and than the LOOCV R-squared), since leaving more observations out is considered a more demanding test?

Is there any literature you could suggest that I could check to help explain this observation?

The R script I have attached contains the sample models, which you can run to get a better picture of what I am trying to describe. I have also attached the R workspace data file that was used.

Thanks in advance and best regards,

#R #Statistics #Cross-validation
