I am trying to use Mallows' Cp to select the best NIR calibration model. Each sample spectrum has 700 wavelengths with a high degree of multicollinearity, so there are potentially 700 predictors. Normal practice is to compress these 700 variates into PCA scores and/or PLS dimensions. Using PCA scores, I have many subset calibration models with, for example, 1 to 15 terms, and one option is to use Mallows' Cp statistic to find the best sub-model:
Mallows' Cp: $C_p = \mathrm{resSS}_p / \hat{\sigma}^2 - N + 2p$
I have the error-variance estimate $\hat{\sigma}^2$ from a full model (with 32 terms) and the residual sum of squares from all subset models (N = 54). My plot of Cp versus number of model terms looks like a negative exponential trend, whereas the plot in Mallows' paper shows a positive linear trend. If the residual sum of squares is non-increasing and $\hat{\sigma}^2$ is fixed, how can a plot of Cp versus model terms show a positive trend?
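To make the calculation concrete, here is a minimal sketch of the procedure described above. It is only an illustration under stated assumptions: the spectra are synthetic (five made-up latent factors plus noise) rather than real NIR data, p is assumed to count the intercept as well as the score terms, and the names `mallows_cp` and `rss_for_k_scores` are hypothetical helpers, not from any library.

```python
import numpy as np

def mallows_cp(rss, sigma2, n, p):
    """Mallows' Cp = RSS_p / sigma2 - N + 2p for a sub-model with p parameters."""
    return rss / sigma2 - n + 2 * p

def rss_for_k_scores(scores, y, k):
    """Residual sum of squares of an OLS fit on an intercept plus the first k scores."""
    X = np.column_stack([np.ones(len(y)), scores[:, :k]])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(0)
n, n_wavelengths, n_full = 54, 700, 32  # sizes taken from the question

# ASSUMPTION: synthetic, highly collinear "spectra" stand in for real NIR data.
latent = rng.normal(size=(n, 5))
loadings = rng.normal(size=(5, n_wavelengths))
spectra = latent @ loadings + 0.05 * rng.normal(size=(n, n_wavelengths))
y = latent @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + 0.1 * rng.normal(size=n)

# PCA scores from the SVD of the mean-centred spectra.
centred = spectra - spectra.mean(axis=0)
U, s, _ = np.linalg.svd(centred, full_matrices=False)
scores = U * s

# Error variance estimated once from the full 32-term model and then held fixed.
# ASSUMPTION: p counts the intercept, so the full model has 33 parameters.
p_full = n_full + 1
sigma2 = rss_for_k_scores(scores, y, n_full) / (n - p_full)

terms = np.arange(1, n_full + 1)
rss = np.array([rss_for_k_scores(scores, y, k) for k in terms])
cp = mallows_cp(rss, sigma2, n, terms + 1)

for k, r, c in zip(terms, rss, cp):
    print(f"terms={k:2d}  RSS={r:10.4f}  Cp={c:10.2f}")
```

With the variance estimate taken from the full model, Cp for that full model equals its own parameter count by construction, which gives one fixed point on the plot to check a calculation against.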
I would like your help in resolving this apparent contradiction; all the articles I have found online present plots similar to the one in Mallows' paper.