I'm running some regressions with a set of variables that are highly correlated with one another. That is not surprising, since one is the interaction of the other two. The variables are: effective number of parties (ENP), effective number of opposition parties (ENOP), and their interaction (ENP * ENOP). When I run basic OLS and test for multicollinearity, these three (and only these three) show up as having high VIF scores. I want to drop the two original variables (ENP and ENOP) and keep only the interaction term. I justify this theoretically because the interaction contains much of the same information and actually helps "spread out" (i.e. normalize) the values a bit more. The interaction term also essentially builds a scale from zero (only the party in power campaigned in that election) to a high number (high ENP *and* ENOP), which suggests higher voter fragmentation. Thoughts?
