We know that AIC and BIC can be used to compare statistical models. But how can one justify selecting a model with higher AIC or BIC values if that model fits the data better than the ones with lower values? If possible, please recommend some good journal papers that faced similar situations and defended their model selection.
For example, in one of my ongoing research projects,
Model 1 (fixed-parameters logit model) has AIC 1909.4, BIC 2093, RMSE 2.68,
Model 2 (random-parameters logit model) has AIC 1909.1, BIC 2108, RMSE 0.51, and
Model 3 (also a random-parameters logit model, with different assumptions about the random parameters) has AIC 1911.6, BIC 2137, RMSE 0.51.
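For reference, here is a minimal sketch (in Python, with hypothetical placeholder numbers rather than my actual estimates) of how I understand AIC and BIC are computed; the heavier ln(n) penalty per parameter is presumably why BIC ranks the smaller Model 1 best while AIC marginally prefers Model 2:

```python
import numpy as np

def aic(loglik, k):
    # AIC = 2k - 2*lnL: each extra parameter costs 2 points
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    # BIC = k*ln(n) - 2*lnL: each extra parameter costs ln(n) points,
    # which exceeds AIC's penalty once n > e^2 (roughly 7.4 observations)
    return k * np.log(n) - 2 * loglik

# Hypothetical log-likelihoods, parameter counts, and sample size
# (placeholders for illustration only, not my actual estimates)
ll_fixed, k_fixed = -950.0, 8    # smaller, fixed-parameters model
ll_rp,    k_rp    = -945.0, 12   # larger, random-parameters model
n = 1000

print("fixed :", aic(ll_fixed, k_fixed), bic(ll_fixed, k_fixed, n))
print("random:", aic(ll_rp, k_rp), bic(ll_rp, k_rp, n))
```

With these placeholder numbers AIC favors the larger model while BIC favors the smaller one, which is the same kind of disagreement I see between my models.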
Model 1 found 6 fixed variables to be significant (at the 90% confidence level),
Model 2 found 10 fixed variables and 1 random variable to be significant, whereas
Model 3 found 6 fixed variables and 2 random variables to be significant.
The likelihood-ratio test suggests Model 2 is slightly better than Model 1 (p-value 0.099) and considerably better than Model 3.
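In case it matters for the answer, this is roughly how the likelihood-ratio p-values were obtained, assuming the models are nested (again a sketch with hypothetical placeholder values, not my actual estimates):

```python
from scipy.stats import chi2

def lr_test(ll_restricted, ll_full, extra_params):
    # Likelihood-ratio statistic: LR = 2*(lnL_full - lnL_restricted),
    # compared against a chi-squared distribution whose degrees of
    # freedom equal the number of extra parameters in the full model
    lr = 2.0 * (ll_full - ll_restricted)
    p_value = chi2.sf(lr, extra_params)
    return lr, p_value

# Hypothetical log-likelihoods and parameter difference
# (placeholders for illustration only, not my actual estimates)
lr_stat, p = lr_test(ll_restricted=-950.0, ll_full=-945.0, extra_params=4)
print(f"LR = {lr_stat:.2f}, p-value = {p:.3f}")
```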
The purpose of the statistical modeling here is to obtain estimation results and use the significant variables for further analyses; these models are not meant to be used for prediction.
Now, based on this information, which model should I choose?