I am analysing soil microbial community data for classification and discrimination. PLS-DA seems more efficient at separating microbial groups. What is the difference between PCA and PLS-DA? When should PLS-DA be used rather than PCA?
PCA is entirely unsupervised. With PLS-DA you run a regression between your descriptors and the class labels, so the classes are defined as the response variable from the very beginning. That is why it separates the groups more efficiently, but it also means you need to know which class each observation belongs to.
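To make the supervised/unsupervised contrast concrete, here is a minimal sketch in plain Python. The data are made up for illustration: feature 0 varies a lot but ignores the class, while feature 1 varies little but carries the class difference. The PLS part is only the first weight vector of a single-response PLS regression (proportional to X'y after centering), not a full PLS-DA with several components, so treat it as an illustration rather than a reference implementation.

```python
import math

# Hypothetical toy data: rows are observations, columns are two descriptors.
X = [(-4, -0.6), (-2, -0.4), (0, -0.5), (2, -0.6), (4, -0.4),
     (-4, 0.4), (-2, 0.6), (0, 0.5), (2, 0.4), (4, 0.6)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # known class of each observation


def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


n = len(X)
cols = [centered([row[j] for row in X]) for j in range(2)]
yc = centered(y)

# PCA (unsupervised): leading eigenvector of the covariance matrix,
# found by power iteration. The class labels are never used.
cov = [[sum(a * b for a, b in zip(cols[i], cols[j])) / (n - 1)
        for j in range(2)] for i in range(2)]
pca_dir = [1.0, 1.0]
for _ in range(200):
    pca_dir = unit([sum(cov[i][j] * pca_dir[j] for j in range(2))
                    for i in range(2)])

# PLS (supervised): first weight vector, proportional to X'y,
# i.e. the direction of maximum covariance with the class labels.
pls_dir = unit([sum(a * b for a, b in zip(cols[j], yc)) for j in range(2)])


def scores(direction):
    """Project every centered observation onto a direction."""
    return [sum(cols[j][i] * direction[j] for j in range(2)) for i in range(n)]


pca_scores = scores(pca_dir)
pls_scores = scores(pls_dir)

print("PCA direction:", [round(c, 3) for c in pca_dir])
print("PLS direction:", [round(c, 3) for c in pls_dir])
print("PCA scores:", [round(s, 2) for s in pca_scores])
print("PLS scores:", [round(s, 2) for s in pls_scores])
```

On these toy data the PCA direction follows the high-variance descriptor, so the two classes overlap along it, while the PLS direction follows the descriptor that covaries with the labels, so the classes fall on opposite sides of zero. Real community data will rarely be this clean, and PLS-DA should be cross-validated because it can find apparent separation even in noise.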
You can use SPSS: Analyze > Dimension Reduction > Factor > Extraction, then choose Principal components, Generalized least squares or one of the other methods. In my profile there are papers in which PCA is used, e.g. "Classifying railway station.....". You can see an example of the difference between the two methods at the link below.
There is an R package called pls that can be used for PLS-DA. Just code your classes as integers and use them as the response vector. file:///C:/Users/matand/AppData/Local/Microsoft/Windows/INetCache/IE/LKRCPXIA/pls-manual.pdf
@Nianxun, PCA (also called eigenvector analysis) is an unsupervised pattern recognition technique used mostly for data reduction and modelling. It determines the extent to which variables are related. Large data sets with many variables inevitably contain redundant, overlapping information; PCA quantifies this redundancy by extracting the eigenvalues and eigenvectors of the correlation matrix, a square matrix obtained by multiplying the transposed standardized data matrix by itself. PLS discriminant analysis is a supervised technique that uses the PLS algorithm to explain and predict the class membership of observations from quantitative or qualitative explanatory variables. Many statistical packages can perform these techniques, such as JMP Pro, R and SPSS.
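As a small worked example of the eigenvalue step, here is a sketch in plain Python for the simplest case of two variables. The numbers are invented for illustration. For two standardized variables with correlation r, the correlation matrix is [[1, r], [r, 1]], and its eigenvalues are 1 + r and 1 - r with eigenvectors along (1, 1) and (1, -1), so the more strongly the variables are correlated, the more of the total variance the first component absorbs.

```python
import math

# Hypothetical measurements of two variables on five samples.
x = [1, 2, 3, 4, 5]
z = [2, 1, 4, 3, 5]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


r = pearson(x, z)
corr = [[1.0, r], [r, 1.0]]

# Closed-form eigen decomposition of a 2x2 correlation matrix
# (written for r >= 0, so that 1 + r is the larger eigenvalue).
eigenvalues = [1 + r, 1 - r]
s = 1 / math.sqrt(2)
eigenvectors = [[s, s], [s, -s]]

# Share of the total variance carried by each principal component.
explained = [ev / sum(eigenvalues) for ev in eigenvalues]

# Sanity check: corr @ v should equal eigenvalue * v for each pair.
residuals = [
    max(abs(sum(corr[i][j] * v[j] for j in range(2)) - ev * v[i])
        for i in range(2))
    for ev, v in zip(eigenvalues, eigenvectors)
]

print("r =", round(r, 3))
print("eigenvalues:", [round(ev, 3) for ev in eigenvalues])
print("explained variance:", [round(p, 3) for p in explained])
```

With more than two variables there is no such shortcut, and the decomposition is done numerically by the statistics package.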
I was not familiar with this technique until you mentioned it. You may want to look at this paper. It suggests that for two groups of equal size PLS-DA is the same as a standard predictive discriminant analysis based on Euclidean distance. I haven't finished the paper myself, but it provides details and compares all of the major discriminant analyses. At first glance it appears to be an excellent review paper, with what looks to be all the equations you'd need to write your own R code. In any case, PLS-DA may have some limitations that you should at least consider before passing up the standard discriminant analysis tools. This paper should help you decide.