Hi,

I am trying to analyze a data set with numerous independent variables, some categorical (two or three levels each) and some continuous. Unfortunately, several of the categorical (and continuous) variables are correlated with one another (e.g. the level coded as 1 in one variable tends to occur together with level 1 in other variables). I am looking for the best way to reduce the number of explanatory variables so that they can later be used in a GLM analysis. Specifically, I wonder how to reduce the number of categorical variables in a way analogous to PCA, so that I end up with uncorrelated variables. I have noticed that SPSS offers a PCA for categorical variables. However, I am curious whether there are other solutions or statistical methods, especially ones implemented in R. I would appreciate any suggestions on how to cope with this kind of data.
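
To make the problem concrete, here is a minimal sketch in Python using made-up values (this is not my real data, and the names var_a and var_b are only placeholders). It simply cross-tabulates two categorical variables to show the kind of co-occurrence of levels I mean; it is an illustration of the problem, not a proposed solution.

```python
from collections import Counter

# Hypothetical toy data (NOT the real data set): two categorical
# predictors, each coded with levels 1 and 2, one value per observation.
var_a = [1, 1, 1, 1, 2, 2, 2, 2, 1, 2]
var_b = [1, 1, 1, 2, 2, 2, 2, 2, 1, 2]


def crosstab(a, b):
    """Count how often each pair of levels occurs together."""
    return Counter(zip(a, b))


def agreement(a, b):
    """Share of observations where both variables carry the same level code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


table = crosstab(var_a, var_b)
for (level_a, level_b), n in sorted(table.items()):
    print(f"var_a={level_a} var_b={level_b} count={n}")
print("agreement:", agreement(var_a, var_b))
```

In this toy example most observations fall on the diagonal of the table (level 1 with level 1, level 2 with level 2), which is the pattern I see across several of my categorical variables.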

Regards

Piotr
