After mean-centering two variables (in order to reduce multicollinearity), should I include in the model, besides the interaction term, the two mean-centered variables or the two original (uncentered) ones?
In the early days of ridge regression, centering and standardizing were recommended. That advice is less common now, though you will still want to center and standardize if you are going to run ridge. I would say that as long as you are consistent, it doesn't matter much which you use in most cases. Best wishes.
Suppose you have a variable 'X' with levels 0, 1, 2, 3, 4. If you create a new term for the model, 'X^2', the levels of this new term are 0, 1, 4, 9, 16. There will be a high degree of correlation between 'X' and 'X^2'.
If you mean center the variable so that 'Xc' has levels -2, -1, 0, 1, 2, your 'Xc^2' term now has levels 4, 1, 0, 1, 4. The correlation between 'Xc' and 'Xc^2' is now zero (exactly zero here, because the centered levels are symmetric about zero).
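A quick numeric check of the example above (a minimal sketch in Python; the variable names are just illustrative):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4], dtype=float)
print(np.corrcoef(x, x**2)[0, 1])    # ~0.96: 'X' and 'X^2' are strongly correlated

xc = x - x.mean()                    # centered levels: -2, -1, 0, 1, 2
print(np.corrcoef(xc, xc**2)[0, 1])  # 0.0: centering removes the correlation here
```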
If you look at a lot of the classical texts on "Design of Experiments", you'll notice that the designs tend to be coded at two levels as -1 and +1. By coding everything as -1 and +1 (or -1, 0, and +1), any interaction or quadratic term you create is less correlated with, or entirely uncorrelated with, the other terms in the model.
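For instance, in a full 2^2 factorial with ±1 coding, the interaction column is exactly orthogonal to both main-effect columns (a small sketch using the standard coded layout):

```python
import numpy as np
from itertools import product

# full 2^2 factorial design, factors coded -1/+1
A, B = np.array(list(product([-1, 1], repeat=2))).T
AB = A * B                      # interaction column: 1, -1, -1, 1
print(np.corrcoef([A, B, AB]))  # off-diagonal entries are 0: columns are orthogonal
```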
In a paper I am trying to get published, I took a 7-factor model and put every possible 3-way interaction into the model (AAB, ABB, ABC, etc.). The VIF for some of these terms was in the thousands when I used the original scale of the data. When I switched to mean-centered data, the highest VIF was around 20.
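The same effect is easy to reproduce on simulated data (a hypothetical sketch using statsmodels' variance_inflation_factor; the factor ranges are made up for illustration):

```python
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.uniform(100, 110, 200)  # hypothetical factors on their raw scales
x2 = rng.uniform(100, 110, 200)

def vifs(a, b):
    # design matrix: intercept, main effects, and their interaction
    X = np.column_stack([np.ones_like(a), a, b, a * b])
    return [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]

print(vifs(x1, x2))                          # raw scale: VIFs in the thousands
print(vifs(x1 - x1.mean(), x2 - x2.mean()))  # centered: VIFs near 1
```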
I compared the final models for the original data vs. the mean-centered data. For illustrative purposes, I reduced the model based on p-values alone, just to show how different the resulting models could be because of the high VIFs for most terms. As a demonstration of how the VIFs and coefficients change between the two scalings, and between a "full" model and a "reduced" model, it was phenomenal! (I wouldn't use either model for anything more than to show what a bad idea it is to use p-values as the only means of reducing a model.)
Andrew, mean centering a categorical variable does not make sense (at least to me).
Your example is about the correlation between a variable and its square. Try a continuous variable: mean center it and square it, then find the correlations. You will see that the correlations will not change. Correlation is invariant under linear transformations.
I cannot see how this is related to reducing multicollinearity. Mean centering of a continuous variable does not have any effect on multicollinearity.
Except for easing the interpretation of the results, mean centering does not impact the results. This is clearly demonstrated in the book Doing Statistical Mediation and Moderation by Paul E. Jose.