
In multiple regression, what do you do during variable selection when some models with excellent fit (high adjusted R-squared, low mean squared error, low Mallows' Cp) show a high level of multicollinearity (several large variance inflation factors), while the models with manageable multicollinearity have much lower adjusted R-squared values?
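For context, here is a minimal sketch of the kind of check I mean, using simulated data (the variables x1, x2, x3 and the DataFrame are purely hypothetical, not from any real study): it reports the fit statistics alongside the VIFs for one candidate model.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical data: x2 is nearly a copy of x1, so the model fits well
# but carries severe multicollinearity.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
y = 2 * x1 + 0.5 * x3 + rng.normal(size=n)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2, "x3": x3})

X = sm.add_constant(df[["x1", "x2", "x3"]])
fit = sm.OLS(df["y"], X).fit()

print("Adjusted R-squared:", round(fit.rsquared_adj, 3))
print("Residual MSE:", round(fit.mse_resid, 3))

# VIF for each predictor; values well above 5-10 usually flag trouble
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, "VIF =", round(variance_inflation_factor(X.values, i), 1))
```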

Would you run ridge regression (which many users find more complicated), would you inspect the regressors (independent variables) with the goal of shortening the model by eliminating some of the X's, or what else would you do in such a situation?
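To make the ridge option concrete, here is a minimal sketch, again on hypothetical simulated data, of keeping all the collinear X's but shrinking their coefficients, with the penalty chosen by cross-validation rather than by dropping variables outright:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same hypothetical setup: x2 is nearly collinear with x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
y = 2 * x1 + 0.5 * x3 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Standardize so the penalty treats all predictors on a common scale,
# then let cross-validation pick the shrinkage strength (alpha).
ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-3, 3, 25)))
ridge.fit(X, y)
print("Chosen alpha:", ridge.named_steps["ridgecv"].alpha_)
print("Shrunken coefficients:", ridge.named_steps["ridgecv"].coef_)
```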

How damaging is a high level of multicollinearity to the regression, and how do you balance fit against prediction, and over-fitting against under-fitting?
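One way I think about that trade-off is to compare in-sample fit with out-of-sample error; here is a minimal sketch (same hypothetical simulated data as above) that contrasts the full collinear model with a reduced one:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # redundant near-copy of x1
x3 = rng.normal(size=n)
y = 2 * x1 + 0.5 * x3 + rng.normal(size=n)

candidates = {
    "full (collinear)": np.column_stack([x1, x2, x3]),
    "reduced": np.column_stack([x1, x3]),
}
for name, X in candidates.items():
    # In-sample R-squared always favors the bigger model; cross-validated
    # error shows whether the extra collinear term helps prediction at all.
    r2 = LinearRegression().fit(X, y).score(X, y)
    cv_mse = -cross_val_score(LinearRegression(), X, y,
                              scoring="neg_mean_squared_error", cv=5).mean()
    print(f"{name}: in-sample R2 = {r2:.3f}, 5-fold CV MSE = {cv_mse:.3f}")
```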

This situation comes up often for me. Thanks for your feedback! We can all learn from each other here.
