I was wondering: for discrimination or classification purposes, is it better to use LDA instead of PLS-DA on NIR data? Which analysis works better on this kind of data, and why?
My experience is that the combination of PLS-DA (in the form of PLS2 with dummy coded class response) and LDA (on the scores from PLS-DA) is beneficial. You get the low dimensional view of the data and remove rank deficiencies through PLS-DA, while getting the most out of the classification by splicing it with LDA. In both methods it is also possible to give higher weights to difficult classes to improve the classifications.
The main reason not to use LDA directly is that there are often fewer samples than wavelengths measured, and neighbouring wavelengths are highly correlated, so LDA will be troubled by multicollinearity, resulting in unstable, non-generalizable solutions.
Dear Kristian Hovde Liland, I am so grateful for your answer. Thank you so much.
I have read about another way to tackle the classification problem in NIR: the support vector machine (SVM) seems to be a really good option. If you know this technique, could you share some advice on using SVM on NIR data? Do you think collinearity can affect SVM in the same way it affects LDA?
SVM may be a good alternative if there are non-linearities in the data, in combination with a kernel, e.g. radial basis functions. However, if what you are measuring is a chemical mixture, the Beer-Lambert law usually holds, which means linear methods like PLS-DA and LDA are sufficient and often more robust to overfitting. Whichever method you choose, validation will be important for assessing the needed complexity. Typically you can cross-validate or, if you have enough data, even hold some aside for test set validation.
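As a rough illustration of comparing a linear and a radial basis function SVM by cross-validation, here is a small scikit-learn sketch. The data are synthetic (the helper `make_fake_nir` is hypothetical), and the settings `C=1.0` and `gamma="scale"` are just library defaults used as placeholders, not tuned recommendations.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_fake_nir(n_per_class=20, n_wavelengths=200, seed=0):
    """Hypothetical helper: synthetic, highly collinear 'spectra' for 3 classes."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(0.0, 1.0, n_wavelengths)
    profiles = np.array([np.exp(-((wl - c) ** 2) / 0.02) for c in (0.3, 0.5, 0.7)])
    y = np.repeat(np.arange(3), n_per_class)
    noise = np.cumsum(rng.normal(scale=0.01, size=(y.size, n_wavelengths)), axis=1)
    return profiles[y] + noise, y


X, y = make_fake_nir()

# Same folds for both models so the comparison is like for like
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Scaling sits inside the pipeline so it is re-fitted on each training fold
models = {
    "linear SVM": make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0)),
    "RBF SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
}

cv_accuracy = {
    name: cross_val_score(model, X, y, cv=cv).mean() for name, model in models.items()
}
for name, acc in cv_accuracy.items():
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```

If the kernel version does not clearly beat the linear one under cross-validation, that is a hint that the extra flexibility is not needed for the data at hand.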
You can also apply PCA to your NIR data and then use the PCA scores in your LDA model. PCA will generate the scores based on the "natural" clustering of your data. When these scores are used in LDA, you will get a more realistic model, but you still need to do cross-validation and external validation. Contrary to PCA, PLS-DA generates the scores so as to maximize the class separation, which might make it prone to overfitting.
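A PCA followed by LDA model can be sketched as a scikit-learn pipeline, with the number of principal components picked by cross-validation. Again this is only a sketch on synthetic data: the helper `make_fake_nir` and the candidate component counts are my own assumptions, and an external validation set would still be needed on real data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline


def make_fake_nir(n_per_class=20, n_wavelengths=200, seed=0):
    """Hypothetical helper: synthetic, highly collinear 'spectra' for 3 classes."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(0.0, 1.0, n_wavelengths)
    profiles = np.array([np.exp(-((wl - c) ** 2) / 0.02) for c in (0.3, 0.5, 0.7)])
    y = np.repeat(np.arange(3), n_per_class)
    noise = np.cumsum(rng.normal(scale=0.01, size=(y.size, n_wavelengths)), axis=1)
    return profiles[y] + noise, y


X, y = make_fake_nir()

# PCA is unsupervised: the scores do not use the class labels at all
pipe = Pipeline([
    ("pca", PCA()),
    ("lda", LinearDiscriminantAnalysis()),
])

# Candidate numbers of components (arbitrary choices for illustration)
param_grid = {"pca__n_components": [2, 3, 5, 8]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

grid = GridSearchCV(pipe, param_grid, cv=cv)
grid.fit(X, y)

best_n = grid.best_params_["pca__n_components"]
print(f"Best number of PCs: {best_n}, mean CV accuracy: {grid.best_score_:.2f}")
```

Because PCA sits inside the pipeline, it is re-fitted on each training fold, so the cross-validated accuracy is not contaminated by the left-out samples.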
If your major concern is collinearity, you can also apply regularized discriminant analysis (RDA) directly to your NIR data. It is like a "special version" of LDA designed to deal with collinear data. However, you can only do this if you have more samples than variables; if not, you still have to reduce the dimensionality of your data first by using PCA.
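As far as I know, scikit-learn has no estimator named RDA, so the sketch below uses a related but different technique: LDA with a shrinkage-regularized covariance estimate (`solver="lsqr"`, `shrinkage="auto"`, which uses the Ledoit-Wolf estimate). It is not Friedman's RDA itself, only an illustration of regularizing LDA against collinearity. The data are synthetic, the helper `make_fake_nir` is hypothetical, and the example deliberately uses more samples (150) than variables (50).

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score


def make_fake_nir(n_per_class=50, n_wavelengths=50, seed=0):
    """Hypothetical helper: synthetic, highly collinear 'spectra' for 3 classes."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(0.0, 1.0, n_wavelengths)
    profiles = np.array([np.exp(-((wl - c) ** 2) / 0.02) for c in (0.3, 0.5, 0.7)])
    y = np.repeat(np.arange(3), n_per_class)
    noise = np.cumsum(rng.normal(scale=0.01, size=(y.size, n_wavelengths)), axis=1)
    return profiles[y] + noise, y


X, y = make_fake_nir()  # 150 samples, 50 wavelengths: more samples than variables
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Plain LDA: the pooled covariance is estimated without any regularization
lda_plain = LinearDiscriminantAnalysis(solver="lsqr", shrinkage=None)
# Shrinkage LDA: the covariance is pulled towards a scaled identity matrix
lda_shrunk = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")

acc_plain = cross_val_score(lda_plain, X, y, cv=cv).mean()
acc_shrunk = cross_val_score(lda_shrunk, X, y, cv=cv).mean()
print(f"Plain LDA: {acc_plain:.2f}, shrinkage LDA: {acc_shrunk:.2f}")
```

The `shrinkage` parameter can also be set to a fixed value between 0 and 1 and tuned by cross-validation instead of using the automatic estimate.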