that depends on the comparison. There are notable differences between PCA and EFA:
- data reduction (PCA) vs. construction/interpretation of latent variables (EFA)
- formative measurement model (PCA) vs. reflective measurement model (EFA)
- assumption of no measurement error, so explained variance can reach 100 % (PCA) vs. only shared variance is explained (not unique or error variance), so explained variance < 100 % (EFA).
Both techniques may well produce fairly similar results when only a few items are used, but PCA and EFA are not synonymous.
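For anyone who wants to see the variance bullet in code, here is a minimal sketch with simulated data. It assumes scikit-learn; note that its FactorAnalysis is a maximum-likelihood common factor model, so it merely stands in for EFA here:

```python
# Sketch of the variance point above: retaining all components, PCA
# reproduces 100 % of the total variance, while a common factor model
# accounts only for the shared variance (communalities < 1).
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
n, p = 500, 6
factor = rng.normal(size=(n, 1))               # one latent variable
X = factor @ rng.uniform(0.5, 0.8, (1, p))     # shared variance ...
X += rng.normal(scale=0.7, size=(n, p))        # ... plus unique/error variance
X = (X - X.mean(0)) / X.std(0)                 # standardized items

print(PCA(n_components=p).fit(X).explained_variance_ratio_.sum())  # -> 1.0
fa = FactorAnalysis(n_components=1).fit(X)
print((fa.components_ ** 2).sum(axis=0).round(2))  # communalities, each < 1.0
```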
Thank you, Marcel Grieger. I'm conducting an EFA on a 9-item measure. Prior studies have only ever performed a PCA on the measure. Can you use the PCA results of prior studies, which indicate the measure has 1-2 components (one with 5 strongly loading variables), to justify the sample size required for an EFA?
You are welcome. I am not surprised that prior studies used PCA instead of EFA. Various meta-analyses have shown the great persistence of less adequate techniques when it comes to factor analysis.
Sample size is often discussed in the context of EFA. You will find many threads on the matter here on RG, although I do not quite understand what sample size you want to "justify".
If you compare PCA with principal axis factoring (a popular extraction method in common factor analysis), you'll tend to find that PCA yields stronger estimated loadings than common factor analysis, unless the number of variables involved is large. In your case, nine variables is not a large number.
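As a hedged illustration of that tendency (simulated data; scikit-learn's maximum-likelihood FactorAnalysis standing in for principal axis factoring), the loading inflation is easy to reproduce with nine items:

```python
# Sketch: with nine items, first-component PCA loadings tend to exceed
# the corresponding common-factor loadings on the same data.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(7)
n, p = 400, 9
X = rng.normal(size=(n, 1)) @ rng.uniform(0.4, 0.7, (1, p))
X += rng.normal(scale=0.8, size=(n, p))        # unique/error variance
X = (X - X.mean(0)) / X.std(0)                 # standardized items

pca = PCA(n_components=1).fit(X)
pca_loadings = pca.components_[0] * np.sqrt(pca.explained_variance_[0])
fa_loadings = FactorAnalysis(n_components=1).fit(X).components_[0]

print("PCA:", np.abs(pca_loadings).round(2))
print("EFA:", np.abs(fa_loadings).round(2))    # typically smaller
```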
I'd like to opine about Marcel's observation about the high rate of use of PCA in the literature, when common factor analysis would likely have been the better choice. A lot of folks lay the blame for this on SPSS (and similar software packages), in which the default extraction method for the "factoring" subprogram is principal components analysis. I suspect a lot of users probably didn't know better or even attend to the fact that this was the case.
Principal component analysis (PCA) and exploratory factor analysis (EFA) have significant similarities but different applications and interpretations. Both models can be applied to the same data and may yield similar results. Therefore, the choice between PCA and EFA should be governed by a study’s underlying theory, sample characteristics, and analysis goals. That being said, EFA is often recommended in social sciences research, particularly studies measuring psychological constructs or assessing the psychometric properties of a new scale, because it better represents what we perceive to be true in the population. PCA can be a good alternative if EFA is not attainable due to estimation issues or a large number of items. Here are some insightful reads.
Alavi, M., Visentin, D. C., Thapa, D. K., Hunt, G. E., Watson, R., & Cleary, M. (2020). Exploratory factor analysis and principal component analysis in clinical studies: Which one should you use? Journal of Advanced Nursing, 76(8), 1886–1889. https://doi.org/10.1111/jan.14377
Ferrando, P. J., Hernandez-Dorado, A., & Lorenzo-Seva, U. (2022). Detecting correlated residuals in exploratory factor analysis: New proposals and a comparison of procedures. Structural Equation Modeling: A Multidisciplinary Journal, 0(0), 1–9. https://doi.org/10.1080/10705511.2021.2004543
Goretzko, D., Pham, T. T. H., & Bühner, M. (2021). Exploratory factor analysis: Current use, methodological developments and recommendations for good practice. Current Psychology, 40(7), 3510–3521. https://doi.org/10.1007/s12144-019-00300-2
Nguyen, H. V., & Waller, N. G. (2022). Local minima and factor rotations in exploratory factor analysis. Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000467
For those following this thread, here is a very helpful description of how PCA/EFA might produce different factor loadings (credit to @AsadKhan of Uni of Qld for providing this answer):
When variables don’t have anything in common, EFA won’t find a well-defined underlying factor, but PCA will find a well-defined principal component that explains the maximal amount of variance in the data.
When the goal is to measure a latent variable but PCA is used, the component loadings will most likely be higher than they would have been had EFA been used. This can mislead someone into thinking they have a well-defined, error-free factor when in fact they have a well-defined component that’s an amalgam of all the sources of variance in the data.
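A rough simulation of both points above (assumed setup, scikit-learn estimators):

```python
# With mutually uncorrelated items, a one-factor EFA finds essentially
# nothing, while PCA still returns a first component that, by construction,
# captures the largest share of variance (roughly 1/p here).
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 8))                  # eight items, no shared variance

pca = PCA(n_components=1).fit(X)
fa = FactorAnalysis(n_components=1).fit(X)

print("PC1 variance share:", pca.explained_variance_ratio_[0].round(2))  # ~ 1/8
print("EFA loadings:", np.abs(fa.components_[0]).round(2))               # near zero
```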
I would object to the first bullet point and suggest an addition to the second:
If "variables don’t have anything in common" (i. e. absence of any correlation), neither EFA nor PCA will produce meaningful factors/components. In extreme cases, Bartlett’s Test of Sphericity would be non-significant which means that the correlation between variables are (overall) not significantly different from zero. The correlation matrix would be identical to the identy matrix (Field 2013, p. 685).
PCA does indeed assume that there is no measurement error, which is why components are not estimated, but rather calculated, utilising the total variance, including the correlation of each item with itself (r = 1) (Field, 2018, p. 780).
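If it helps, Bartlett’s test is straightforward to compute from the correlation matrix; this is a sketch using the textbook formula rather than any particular package:

```python
# Bartlett's test of sphericity: does the correlation matrix R differ from
# the identity matrix? chi2 = -[(n-1) - (2p+5)/6] * ln|R|, df = p(p-1)/2.
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    return stat, chi2.sf(stat, p * (p - 1) / 2)   # statistic, p-value

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                     # uncorrelated items
print(bartlett_sphericity(X))                     # non-significant: R ~ identity
```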
Best
Marcel
Field, A. (2013). Discovering statistics using IBM SPSS Statistics (4th ed.). Los Angeles, CA: SAGE.
Field, A. (2018). Discovering statistics using IBM SPSS Statistics (5th ed.). London: SAGE.