I know the basics of principal component analysis (PCA), but I have a question about the case where many variables are derived from just two. For example, in the analysis of crop stress tolerance we measure YP (yield under normal conditions) and YS (yield under stress conditions), and then compute indices such as TOL = YP - YS or MP = (YP + YS)/2, which are highly correlated with each other. Can we perform PCA on YP and YS together with indices that are derived from them by linear or nonlinear equations? Many articles indexed in Google Scholar use this approach. In effect, we deliberately create many variables and then reduce their dimensionality with PCA! Is it valid to apply PCA to such data?
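To make the concern concrete, here is a small sketch in Python with NumPy on simulated data. Everything in it is an assumption for illustration only: the simulated yields, the names `mean_idx` and `prod_idx`, and the choice of a product as the nonlinear index. It is not taken from any of the published analyses. It counts how many principal components carry non-negligible variance when YP and YS are combined with indices built from them.

```python
import numpy as np

# Simulated yields (hypothetical values, not real crop data).
rng = np.random.default_rng(0)
n = 200
yp = rng.normal(5.0, 1.0, n)             # yield under normal conditions
ys = 0.6 * yp + rng.normal(0.0, 0.5, n)  # yield under stress conditions

# Indices derived from the two measured variables.
tol = yp - ys             # linear in YP and YS
mean_idx = (yp + ys) / 2  # linear in YP and YS
prod_idx = (yp * ys) / 2  # nonlinear in YP and YS (illustrative choice)


def explained_variance_ratio(X):
    """PCA on standardized columns via SVD; returns the variance share per component."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    s = np.linalg.svd(Z, compute_uv=False)
    var = s**2 / (len(Z) - 1)
    return var / var.sum()


linear_only = np.column_stack([yp, ys, tol, mean_idx])
with_nonlinear = np.column_stack([yp, ys, tol, mean_idx, prod_idx])

evr_linear = explained_variance_ratio(linear_only)
evr_nonlinear = explained_variance_ratio(with_nonlinear)

# Count components whose share of variance is not numerically zero.
n_comp_linear = int((evr_linear > 1e-10).sum())
n_comp_nonlinear = int((evr_nonlinear > 1e-10).sum())

print("linear indices only:", np.round(evr_linear, 4), "->", n_comp_linear, "components")
print("plus nonlinear index:", np.round(evr_nonlinear, 4), "->", n_comp_nonlinear, "components")
```

In this sketch the linear indices are exact linear combinations of YP and YS, so they cannot add components beyond the two that YP and YS already span. The nonlinear index adds one more direction, but it is still a deterministic function of the same two measurements.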
