So here is the thing:

  • I ran a Likert-scale questionnaire with 16 questions (n=328) and afterwards wondered whether these questions 'overlapped' somehow. I therefore generated a correlation matrix and ran the KMO and Bartlett's tests to see whether a Principal Component Analysis (PCA) would suit my data and what I had in mind. Both tests came out favorably. The correlation matrix turned out to have no correlation above .6, but the correlations in the .1 to .5 range outnumber the ones between 0 and .1.
  • Based on these results, I ran my PCA (varimax rotation) and got back 4 components with eigenvalues above 1. The problem is that these accounted for only 51% of the cumulative variance.
  • As a next step, I ran a new PCA, this time asking for 8 components (since this would put me above 70% of cumulative variance explained). The problem this time was that I still had some cross-loading (four cases of secondary loadings above .4) and, additionally, I still couldn't make sense of some of the components it gave me.
  • I kept experimenting with PCAs with more components, and when I asked for 11 components I finally got an outcome with almost no cross-loadings that I could make sense of.
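
For anyone who wants to reproduce the bookkeeping behind the steps above, here is a minimal Python sketch of the two retention rules I used: the eigenvalue-above-1 rule and the cumulative-variance target. It is not my actual analysis. The data are synthetic stand-ins for my 16 items, and the function names (`eigen_summary`, `n_kaiser`, `n_for_share`) are made up for illustration. It works on the unrotated solution; an orthogonal rotation such as varimax redistributes variance among the retained components but does not change their combined share.

```python
import numpy as np

def eigen_summary(data):
    """Eigenvalues (descending) of the item correlation matrix and their cumulative share."""
    corr = np.corrcoef(data, rowvar=False)      # items are columns
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # eigvalsh returns ascending order
    cum_share = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, cum_share

def n_kaiser(eigvals):
    """Number of components with eigenvalue above 1."""
    return int(np.sum(eigvals > 1.0))

def n_for_share(cum_share, target):
    """Smallest number of components whose cumulative share reaches `target`."""
    return int(np.searchsorted(cum_share, target) + 1)

# Synthetic stand-in data (an assumption, NOT the real questionnaire):
# 328 respondents, 16 five-point items driven by 4 latent dimensions plus noise.
rng = np.random.default_rng(0)
n_resp, n_items, n_latent = 328, 16, 4
latent = rng.normal(size=(n_resp, n_latent))
weights = rng.normal(scale=0.6, size=(n_latent, n_items))
raw = latent @ weights + rng.normal(size=(n_resp, n_items))
likert = np.clip(np.rint(3 + raw), 1, 5)        # round and clip to the 1..5 scale

eigvals, cum_share = eigen_summary(likert)
print("components with eigenvalue > 1:", n_kaiser(eigvals))
print("components needed for 70% of variance:", n_for_share(cum_share, 0.70))
```

Because the eigenvalues of a 16-item correlation matrix always sum to 16, each component beyond the first few adds only a small slice of variance, which is why chasing a 70% target pushed me towards so many components.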

Now here is my question: given that I had to ask for 11 components out of only 16 variables to achieve this outcome, is my 11-component PCA still reliable? Or should I leave it out of my research?

*BONUS more subjective (and hard) question:*

Let's say my PCA groups two variables into the same component, without cross-loading. Then, when I look at my original correlation matrix (not the PCA loading matrix), I see that the correlation between these two variables is, in fact, only moderate at best. My question: given this scenario, is it even fair to draw conclusions and speculate about the possible reasons for these variables' interrelation? It seems to me that the PCA component is telling me one thing and the original correlation matrix something else.
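
To make the scenario concrete, here is a toy Python sketch with a hypothetical 4-item correlation matrix (invented numbers, not my data). Items 0 and 1 correlate at only .35, yet both end up with a loading of about .82 on the same retained component and essentially zero on the other one. A loading is the correlation between an item and the component, not between the two items, which may be part of why the two matrices seem to disagree.

```python
import numpy as np

# Hypothetical correlation matrix (an assumption for illustration only):
# items 0 and 1 correlate at .35, items 2 and 3 at .50,
# and the two pairs are uncorrelated with each other.
R = np.array([
    [1.00, 0.35, 0.00, 0.00],
    [0.35, 1.00, 0.00, 0.00],
    [0.00, 0.00, 1.00, 0.50],
    [0.00, 0.00, 0.50, 1.00],
])

# Eigendecomposition of the correlation matrix, sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep components with eigenvalue above 1 and scale eigenvectors into loadings.
keep = eigvals > 1.0
loadings = np.abs(eigvecs[:, keep] * np.sqrt(eigvals[keep]))  # sign is arbitrary

print("eigenvalues:", np.round(eigvals, 2))
print("absolute loadings (rows = items, columns = retained components):")
print(np.round(loadings, 2))

# Correlation between items 0 and 1 implied by the retained components alone,
# compared with the observed correlation in R.
implied = float(loadings[0] @ loadings[1])
print("implied r:", round(implied, 3), "observed r:", R[0, 1])
```

In this toy case the retained component on its own implies a correlation of about .68 between items 0 and 1, while the observed value is .35. The gap is carried by a discarded component, so high loadings on a shared component do not by themselves mean the two items are strongly correlated.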
