Hi, when I perform a PCA on a gene list obtained from a comparison of study groups, some of the genes with smaller fold changes and lower statistical significance get larger loading values. Would you be able to explain this?
The p-value (statistical significance) measures the difference between the groups in relation to the variability. The loading values are the coefficients of the linear combination that points in the direction of the largest multivariate variability. The loadings are influenced by the other variables in the data set (their correlation structure). So while the two are related in some way, they cannot be compared directly, except perhaps in the special case of having just one variable (gene).
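To make the point above concrete, here is a small simulated sketch (not the poster's data). It assumes two hypothetical genes: gene_a has a clear group difference and little noise, while gene_b has no group difference but a large overall variance. The PCA is done on the centred, unscaled data via an SVD, and the Welch t-statistic is computed by hand; all names and numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group = 20
group = np.repeat([0, 1], n_per_group)

# Hypothetical genes (illustrative values only):
#   gene_a: real group difference, low noise
#   gene_b: no group difference, high variance
gene_a = 1.0 * group + rng.normal(0, 0.3, size=group.size)
gene_b = rng.normal(0, 3.0, size=group.size)
X = np.column_stack([gene_a, gene_b])

def welch_t(x, g):
    """Welch t-statistic for group 1 minus group 0."""
    a, b = x[g == 0], x[g == 1]
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (b.mean() - a.mean()) / se

t_stats = np.array([welch_t(X[:, j], group) for j in range(X.shape[1])])

# PCA on centred (unscaled) data: rows of vt are the loading vectors.
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
pc1_loadings = vt[0]

print("t-statistics (gene_a, gene_b):", t_stats)
print("PC1 loadings (gene_a, gene_b):", pc1_loadings)
```

In this setup gene_a should have by far the larger t-statistic, while gene_b should dominate the first principal component, simply because it carries most of the total variance. Scaling the genes to unit variance first would change the picture, and then the correlation structure matters more.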
You could do robustness checks by removing from the data set some of the variables that load on the same principal component (a highly correlated one, for example) and seeing how the remaining loadings change.
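A rough sketch of such a robustness check, again on simulated data: genes 0 to 2 are assumed to share a latent factor (so they are highly correlated), and gene 3 is independent of them. The helper pc1_abs_loadings is a made-up name for this example; it standardises the columns and returns the absolute PC1 loadings.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Hypothetical data: genes 0-2 share a latent factor (highly correlated),
# gene 3 is independent of them.
latent = rng.normal(size=n)
block = latent[:, None] + rng.normal(0, 0.3, size=(n, 3))
gene3 = rng.normal(size=n)
X = np.column_stack([block, gene3])

def pc1_abs_loadings(data):
    """Absolute PC1 loadings of column-standardised data (correlation PCA)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return np.abs(vt[0])

full = pc1_abs_loadings(X)                # all four genes
reduced = pc1_abs_loadings(X[:, [0, 3]])  # genes 1 and 2 removed

print("PC1 loadings, all genes:      ", full)
print("PC1 loadings, genes 0 and 3:  ", reduced)
```

With all four genes, PC1 should be carried by the correlated block and gene 3 should get a loading near zero. After dropping genes 1 and 2, gene 3's loading rises to about 0.71 even though its own values are unchanged: with only two standardised variables, both PC1 loadings are equal in magnitude. That is the sense in which a loading depends on which other variables are in the data set.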
Further to Andreas Krause's comments, don't forget that PCA is usually a technique performed to explore/describe/summarize the variation observed in your data -- but without any notion of group membership being accounted for. That is, PCA is an unsupervised method.
In comparison, your p-values and fold-changes are describing the differences between the groups. This would definitely count as a supervised analysis (though the term is not usually used when looking at one predictor at a time).
The idea might be more familiar if you look at a classification method like discriminant analysis rather than at one-predictor-at-a-time methods.
PCA looks at the patterns of variation in a set of variables X1, .., Xp.
Discriminant analysis uses very similar methods to look at the pattern of variation *between* two groups.
Crudely speaking,
PCA: ~ X1 + X2 + ... + Xp
Disc Analysis: Group ~ X1 + X2 + ... + Xp
PCA is unsupervised; DA is supervised. These address different questions, so the importance of the X variables can change substantially.
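As a rough illustration of the contrast, the sketch below computes a PCA direction and a two-group Fisher discriminant direction on the same simulated data. It assumes two hypothetical variables: x1 is noisy and unrelated to the group, x2 shifts with the group. The discriminant direction uses the textbook form w proportional to Sw^-1 (mean1 - mean0), with Sw the pooled within-group scatter; this is one common variant of discriminant analysis, not necessarily what any particular package does.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
group = np.repeat([0, 1], n)

# Hypothetical variables (illustrative values only):
#   x1: high variance, unrelated to group
#   x2: modest variance, mean shifts with group
x1 = rng.normal(0, 3.0, size=group.size)
x2 = 1.0 * group + rng.normal(0, 0.5, size=group.size)
X = np.column_stack([x1, x2])

# PCA (unsupervised): direction of largest overall variance, group ignored.
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
pca_dir = np.abs(vt[0])

# Fisher discriminant (supervised): w proportional to Sw^-1 (mean1 - mean0).
Xg0, Xg1 = X[group == 0], X[group == 1]
Sw = (np.cov(Xg0, rowvar=False) * (len(Xg0) - 1)
      + np.cov(Xg1, rowvar=False) * (len(Xg1) - 1))
w = np.linalg.solve(Sw, Xg1.mean(axis=0) - Xg0.mean(axis=0))
lda_dir = np.abs(w) / np.linalg.norm(w)

print("PCA direction  (x1, x2):", pca_dir)
print("Discriminant   (x1, x2):", lda_dir)
```

Here the PCA direction should be dominated by x1 and the discriminant direction by x2, so the "important" variable flips depending on which question is being asked.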