I want to perform principal component analysis (PCA) and hierarchical cluster analysis (HCA) on a hydrochemical dataset using SPSS or R. However, it is a requirement that the data be normally distributed before these analyses are done. How is the normal distribution of the data accounted for when performing PCA or HCA in R or SPSS?

In R, the data can be scaled using either the median and variance or the mean and variance (see the example below). Is this the same as normalising the data?

An example of scaling data in R:

> medians mads
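The fragment above is incomplete as posted. As a rough sketch of what median-based scaling typically looks like in R, assuming the names `medians` and `mads` hold per-column medians and median absolute deviations, the following uses a made-up matrix `x` (hypothetical data and column names, not from my dataset) and compares it with the default mean-based `scale()`:

```r
# Hypothetical data: 50 samples of 3 right-skewed variables (illustration only)
set.seed(1)
x <- matrix(rlnorm(150), ncol = 3,
            dimnames = list(NULL, c("Ca", "Mg", "Cl")))

# Robust scaling: centre each column on its median, divide by its MAD
medians  <- apply(x, 2, median)
mads     <- apply(x, 2, mad)
x_robust <- scale(x, center = medians, scale = mads)

# Classical scaling: centre on the mean, divide by the standard deviation
x_std <- scale(x)

# Scaling is a linear transformation: it changes location and spread only,
# so the shape of each variable (here measured by skewness) is unchanged
skew        <- function(v) mean((v - mean(v))^3) / sd(v)^3
skew_raw    <- apply(x, 2, skew)
skew_scaled <- apply(x_std, 2, skew)
```

In this sketch `skew_raw` and `skew_scaled` agree, which suggests that scaling by itself puts variables on a common scale but does not make a skewed variable normally distributed; a transformation such as `log()` would be a separate step applied before scaling.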
