I have a problem with my cross-validation R-squared values.
Dear All,
I am writing to ask for some assistance with computing cross-validated R-squared values in RStudio.
I have attached the R script as well as the R workspace related to my problem.
In brief, I fitted a model using lm in RStudio to obtain an R-squared value, which I compared with the value from the program our research group has been using, MOE (picture attached as "R-squared-From-MOE.png").
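For context, the lm step looked roughly like this (a minimal sketch: "dat", "activity" and the descriptor names are placeholders, with simulated values standing in for the attached workspace data):

set.seed(1)
dat <- data.frame(desc1 = rnorm(30), desc2 = rnorm(30), desc3 = rnorm(30))
dat$activity <- with(dat, 0.6 * desc1 - 0.3 * desc2 + rnorm(30))

# ordinary least-squares fit; summary() gives the R-squared I compared with MOE
fit <- lm(activity ~ desc1 + desc2 + desc3, data = dat)
summary(fit)$r.squared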
The R-squared values were similar. I also observed similar cross-validated R-squared values when I used leave-one-out cross-validation (LOOCV) in R and in the MOE program (confirmed in the attached picture).
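The LOOCV run was done along these lines (sketched with the caret package, using the same placeholder data as above):

library(caret)

# leave-one-out cross-validation of the same linear model
ctrl_loocv <- trainControl(method = "LOOCV")
fit_loocv  <- train(activity ~ desc1 + desc2 + desc3, data = dat,
                    method = "lm", trControl = ctrl_loocv)
fit_loocv$results$Rsquared   # cross-validated R-squared, close to the MOE value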
However, when I went on to leave several out (in this case 3), I obtained cross-validated R-squared values I cannot explain despite searching: they are much higher (mostly around 0.7) than the lm R-squared (~0.36) and the LOOCV R-squared (~0.23).
I tried cv, repeatedcv, and rf, and in all cases I got very high R-squared values, again in the neighbourhood of 0.7 (a rough sketch of these runs is shown below).
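Roughly, those runs looked like this (the fold settings below are only illustrative, not necessarily my exact ones; the precise calls are in the attached script):

# k-fold style resampling so that about 3 observations are held out per fold
ctrl_cv  <- trainControl(method = "cv", number = floor(nrow(dat) / 3))
ctrl_rep <- trainControl(method = "repeatedcv",
                         number = floor(nrow(dat) / 3), repeats = 5)

fit_cv  <- train(activity ~ desc1 + desc2 + desc3, data = dat,
                 method = "lm", trControl = ctrl_cv)
fit_rep <- train(activity ~ desc1 + desc2 + desc3, data = dat,
                 method = "lm", trControl = ctrl_rep)

fit_cv$results$Rsquared    # with my real data these come out around 0.7
fit_rep$results$Rsquared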
My questions are:
How can one explain such a result, when it is commonly said that cross-validated R-squared values should normally be smaller than both the lm R-squared and the LOOCV R-squared, since leaving several out is considered more robust?
Is there any literature you could suggest that might explain this observation?
The attached R script contains the sample models, which you can run to get a better picture of what I am trying to explain. I have also attached the R workspace (data file) used.
Thanks in advance and best regards,
#R #Statistics #Cross-validation