I have a data set of city features such as population, poverty, economic position, income, etc. Population and economic position are continuous, but poverty is based on a score: each city has a score of 100, 70, 50 or 30 (only these values). Cities with a score of 100 have the highest poverty and cities with a score of 30 have the lowest. Now I want to use PCA on this data set.

* Is PCA suitable for this mixed data, given that one of the assumptions of PCA is continuous inputs? If not, what techniques are suitable? In the end I need a rotated component matrix based on varimax rotation, so I also need software or a package that can rotate the loadings for me.

* Should I standardize the data when using the correlation matrix (not the covariance matrix) in PCA? (Currently, for theoretical reasons, I select one city as the base city and divide the other cities' feature values by its values, and I remove outliers using the x - mean(x) >= 2 S.D. rule.)

* Should I use the correlation matrix here? It is normally used when the features have different scales, but here my data is dimensionless.

* Is a normal distribution of the inputs an assumption of PCA? My inputs are not normally distributed; should I transform the data using log()?
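
To make the setup behind these questions concrete, here is a minimal sketch in Python with NumPy. It runs PCA on the correlation matrix and then applies a varimax rotation to the loadings. The data are synthetic and every name here (`population`, `income`, `economic_position`, `poverty`, `correlation_pca`, `varimax`) is hypothetical, invented only for illustration. The varimax routine is the common SVD-based iteration, not the implementation of any particular package. The sketch does not settle whether PCA is appropriate for the four-level poverty score. It only shows the mechanics, plus one fact that bears on the standardization question: dividing each feature by a positive base-city value leaves the correlation matrix unchanged.

```python
import numpy as np

def correlation_pca(X):
    """PCA on the correlation matrix: returns eigenvalues and component loadings."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # ascending order
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings = eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via the usual SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):  # no further improvement
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Synthetic stand-in for the city data (assumption: 200 cities, 4 features).
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
population = np.exp(rng.normal(12.0, 1.0, n))
income = np.exp(10.0 + 0.5 * latent + rng.normal(0.0, 0.3, n))
economic_position = np.exp(3.0 + 0.4 * latent + rng.normal(0.0, 0.2, n))

# Poverty score restricted to the four values 30, 50, 70, 100.
poverty_raw = -latent + rng.normal(0.0, 0.5, n)
cuts = np.quantile(poverty_raw, [0.25, 0.5, 0.75])
poverty = np.array([30, 50, 70, 100])[np.digitize(poverty_raw, cuts)]

X = np.column_stack([population, income, economic_position, poverty])

# Dividing every column by the (positive) values of a base city, here row 0,
# rescales each feature by a constant and does not change the correlations.
X_base = X / X[0]
same_corr = np.allclose(np.corrcoef(X, rowvar=False),
                        np.corrcoef(X_base, rowvar=False))

eigvals, loadings = correlation_pca(X_base)
k = 2                                             # components kept (arbitrary choice)
rotated, rotation = varimax(loadings[:, :k])

print("correlation unchanged by base-city scaling:", same_corr)
print("rotated component matrix:")
print(np.round(rotated, 3))
```

Because the correlation matrix is computed from standardized variables by definition, standardizing the columns first would give the same loadings in this sketch. The rotation is orthogonal, so it only redistributes variance among the retained components, and each feature's communality stays the same.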
