Hi,
I used the Kennard-Stone algorithm to split my samples into calibration (70%) and prediction (30%) sets. I calibrated and cross-validated a PLSR model on the calibration set and then used that model to predict the remaining 30% of the samples.
The RMSEP, in this case, is lower than RMSECV.
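I think this can happen when you calibrate and cross-validate a model on a very diverse sample set and then predict a much less diverse one. Kennard-Stone picks the most spread-out samples for calibration, so the prediction samples tend to lie inside the calibrated range, whereas cross-validation has to predict the more extreme samples after leaving them out.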
Is that correct?
Does anyone have an explanation for this?
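
To make the setup concrete, here is a minimal sketch of the kind of split I mean. It is plain Python on synthetic 2-D points standing in for my spectra, not my actual data, and the names `kennard_stone` and `mean_dist_to_centroid` are just illustrative helpers I made up, not functions from any library. It only compares how spread out the two sets are; it does not fit the PLSR model.

```python
import math
import random


def kennard_stone(X, n_select):
    """Split row indices of X into (selected, remaining) by the Kennard-Stone rule."""
    n = len(X)
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda p: dist[p[0]][p[1]],
    )
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}
    # min_d[k] = distance from sample k to its nearest already-selected sample.
    min_d = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}
    while len(selected) < n_select:
        # Add the sample that is farthest from everything selected so far.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_d[nxt]
        for k in remaining:
            min_d[k] = min(min_d[k], dist[k][nxt])
    return selected, sorted(remaining)


def mean_dist_to_centroid(X, idx):
    """Crude diversity measure: average distance of the chosen rows to their centroid."""
    pts = [X[i] for i in idx]
    centroid = [sum(c) / len(pts) for c in zip(*pts)]
    return sum(math.dist(p, centroid) for p in pts) / len(pts)


# Synthetic stand-in data (assumption): 100 samples, 2 variables.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(100)]

cal_idx, pred_idx = kennard_stone(X, 70)  # 70-30 split as in my case
print("calibration spread:", mean_dist_to_centroid(X, cal_idx))
print("prediction spread: ", mean_dist_to_centroid(X, pred_idx))
```

If my reasoning is right, I would expect the calibration spread to come out larger than the prediction spread, which would fit with the prediction set being the "easier" one.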
Thanks!