I am trying to identify which explanatory variables would be worth adding to a mechanistic model of soil carbon dynamics.
I am able to calibrate a simple model on experimental data from several sites. This model is effectively an "average" model without explanatory variables, so it does not simulate the variability that exists between sites. I have some information about the sites (soil properties) that could improve the predictive quality of my model.
I can estimate the MSEP (mean squared error of prediction) of the "average" model, and I would like to estimate the population part (lambda) of the MSEP decomposition of Bunke and Droge (1984) or Wallach and Goffinet (1987). This part represents the minimum MSEP we can get with the explanatory variables present in the model. The larger this part is relative to the MSEP, the more we need to add explanatory variables to improve the predictive quality of the model. This term depends on how much the predicted variable (y) varies for fixed values of the explanatory variables (X) in the model: lambda = E[var(y|X)].
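For reference, this is the form of the decomposition I have in mind, as I understand it for a model whose parameters are held fixed. The symbol \hat{f}(X) for the model prediction is my own notation, not taken from the papers:

```latex
\mathrm{MSEP}
  = E\big[(y - \hat{f}(X))^2\big]
  = \underbrace{E\big[\operatorname{var}(y \mid X)\big]}_{\Lambda}
  + \underbrace{E\big[\big(E[y \mid X] - \hat{f}(X)\big)^2\big]}_{\Delta}
```

Lambda depends only on which variables X are in the model, while Delta depends on how well the model's form and parameters reproduce E[y|X].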
I found that when the explanatory variables are categorical, lambda can be estimated by the residual mean square of a linear model of y on X, which seems logical to me. I first thought the same could be done with continuous explanatory variables, but I now doubt it: if the true relationship is not linear, the linearity assumption adds lack-of-fit error to the residuals, and that error belongs to the squared bias part (Delta) of the MSEP decomposition rather than to lambda.
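To make my doubt concrete, here is a minimal Python sketch on simulated data (not my soil data). The function names, the group means, and the quadratic relationship are all made up for illustration, and the true var(y|X) is set to 1 so that the true lambda is known. The first estimator is the residual mean square of a one-way model for a categorical X; the second is the residual mean square of a straight-line fit for a continuous X.

```python
import numpy as np


def lambda_categorical(y, groups):
    """Residual mean square of y ~ groups (pooled within-group variance)."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    levels = np.unique(groups)
    # Sum of squared deviations of each observation from its own group mean
    rss = sum(((y[groups == g] - y[groups == g].mean()) ** 2).sum() for g in levels)
    return rss / (y.size - levels.size)


def linear_residual_mse(y, x):
    """Residual mean square of a straight-line fit y ~ x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return (resid ** 2).sum() / (y.size - 2)


rng = np.random.default_rng(0)
sigma2 = 1.0  # assumed true var(y|X), i.e. the true lambda in both simulations

# Categorical X: five hypothetical sites with different means, same noise variance
groups = np.repeat(np.arange(5), 200)
group_means = np.array([0.0, 2.0, 5.0, 1.0, 8.0])
y_cat = group_means[groups] + rng.normal(0.0, np.sqrt(sigma2), groups.size)
lam_cat = lambda_categorical(y_cat, groups)

# Continuous X with a nonlinear (quadratic) true relationship
x = rng.uniform(-3.0, 3.0, 1000)
y_cont = x ** 2 + rng.normal(0.0, np.sqrt(sigma2), x.size)
lam_lin = linear_residual_mse(y_cont, x)

print("categorical estimate of lambda:", lam_cat)
print("straight-line residual mean square:", lam_lin)
```

In this simulation the categorical estimate should land close to the true value of 1, whereas the straight-line residual mean square comes out much larger because it also absorbs the lack of fit of the line to the quadratic trend. That extra part is what I would attribute to Delta, not to lambda.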
Do you have any suggestions for how to estimate the lambda part of the MSEP decomposition?
Thanks for the help!
Benjamin