The number of initial variables varies from 8 to 50, and in some cases it is even more than 50. Suppose one has 20 initial variables: what would be the optimal sample size for collecting data?
As far as I know, simulation studies have not confirmed the importance of the subjects-to-variables ratio in EFA. It seems that, apart from the sample size, the average communality and the variables-to-factors ratio (factor overdetermination) are the most important determinants of the accuracy of the exploratory factor solution (a sketch of how both quantities are computed follows the references below). This article may be particularly elucidating:
MacCallum, R.C., & Tucker, L.R. (1991). Representing sources of error in the common-factor model: Implications for theory and practice. Psychological Bulletin, 109, 502-511.
The following ones may also be of interest:
MacCallum, R.C., Widaman, K.F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4, 84-99.
MacCallum, R.C., Widaman, K.F., Preacher, K.J., & Hong, S. (2001). Sample size in factor analysis: The role of model error. Multivariate Behavioral Research, 36, 611-637.
Hogarty, K. Y., Hines, C. V., Kromrey, J. D., Ferron, J. M., & Mumford, K. R. (2005). The quality of factor solutions in exploratory factor analysis: The influence of sample size, communality, and overdetermination. Educational and Psychological Measurement, 65(2), 202-226.
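Since the average communality and the variables-to-factors ratio come up repeatedly in this thread, here is a minimal Python sketch of how both quantities can be read off a fitted exploratory solution. The placeholder data matrix, the choice of four factors, and the use of scikit-learn's FactorAnalysis are my own illustrative assumptions, not anything prescribed by the papers above:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))             # placeholder: substitute your data matrix

Xz = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize so loadings act like correlations
n_factors = 4                              # illustrative choice
fa = FactorAnalysis(n_components=n_factors).fit(Xz)

# fa.components_ has shape (n_factors, n_variables); a variable's communality
# is the sum of its squared loadings across the factors.
communalities = (fa.components_ ** 2).sum(axis=0)
print("average communality:       ", communalities.mean().round(3))
print("variables-to-factors ratio:", Xz.shape[1] / n_factors)
```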
Although there are rules of thumb such as 5 or 10 participants per item, these are simply rules of thumb. Sometimes smaller EFA studies with ns of around 150 (depending on the type of FA and the number of items) can be useful. Thus, rather than combining across small samples to get a large n, you learn what a specific sample's factor structure looks like. And many small-scale studies can help to show whether there are differences across locales, etc. Both (small and large scale) are needed.
Regardless of the rules of thumb, I think you should back up your decisions or evaluations about sample size with an appropriate power analysis. Here is a list of papers that can help you with this issue. See in particular the Monte Carlo simulations used to determine sample size in SEM models (CFA or EFA not excluded). I also include some citations about latent growth analysis below. Hope it helps!
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130–149. doi:10.1037/1082-989X.1.2.130
Meade, A. W., & Bauer, D. J. (2007). Power and Precision in Confirmatory Factor Analytic Tests of Measurement Invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(4), 611–635. doi:10.1080/10705510701575461
Miles, J. (2003). A framework for power analysis using a structural equation modelling procedure. BMC Medical Research Methodology, 3, 27. doi:10.1186/1471-2288-3-27
Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling: A Multidisciplinary Journal, 9, 599–620. doi:10.1207/S15328007SEM0904_8
Stephenson, M. T., & Holbert, R. L. (2003). A Monte Carlo Simulation of Observable Versus Latent Variable Structural Equation Modeling Techniques. Communication Research, 30(3), 332–354. doi:10.1177/0093650203030003004
Thoemmes, F., MacKinnon, D., & Reiser, M. (2010). Power Analysis for Complex Mediational Designs Using Monte Carlo Methods. Structural Equation Modeling: A Multidisciplinary Journal, 17, 510–534. doi:10.1080/10705511.2010.489379
Wolf, E. J., Harrington, K. M., Clark, S. L., & Miller, M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913-934. doi:10.1177/0013164413495237
For Latent Growth Models
Fan, X., & Fan, X. (2005). Power of Latent Growth Modeling for Detecting Linear Growth: Number of Measurements and Comparison With Other Analytic Approaches. The Journal of Experimental Education, 73(2), 121–139. doi:10.3200/JEXE.73.2.121-139
Fan, X. (2003). Power of Latent Growth Modeling for Detecting Group Differences in Linear Growth Trajectory Parameters. Structural Equation Modeling: A Multidisciplinary Journal, 10(3), 380–400. doi:10.1207/S15328007SEM1003_3
Wänström, L. (2009). Sample Sizes for Two-Group Second-Order Latent Growth Curve Models. Multivariate Behavioral Research, 44(5), 588–619. doi:10.1080/00273170903202589
The Muthén & Muthén (2002) article contains an explanation for a CFA and a latent growth model, as do most of the other general articles I suggested. I warmly encourage you to use the Muthén and Muthén approach, even though it is quite laborious.
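For readers who want to try the Monte Carlo logic without Mplus, it can be sketched in plain Python: posit population loadings, repeatedly simulate samples of a candidate size, refit, and record how often the population structure is recovered. Everything below (the two-factor simple structure, loadings of .6, the .95 congruence threshold, and scikit-learn's FactorAnalysis with varimax rotation) is an illustrative assumption of mine, not the Muthén & Muthén procedure itself:

```python
import itertools
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)

# Hypothesized population model: 10 items, 2 factors, simple structure,
# all salient loadings .6 (arbitrary assumptions for the illustration).
L = np.zeros((10, 2))
L[:5, 0] = 0.6
L[5:, 1] = 0.6
uniq = 1.0 - (L ** 2).sum(axis=1)  # uniquenesses for unit-variance items

def congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors."""
    return abs(a @ b) / np.sqrt((a @ a) * (b @ b))

def recovery_rate(n, n_reps=200, threshold=0.95):
    """Proportion of replications in which every factor is recovered
    with a congruence above `threshold` at sample size n."""
    hits = 0
    for _ in range(n_reps):
        Z = rng.normal(size=(n, 2))                   # factor scores
        E = rng.normal(size=(n, 10)) * np.sqrt(uniq)  # unique parts
        X = Z @ L.T + E
        est = FactorAnalysis(n_components=2, rotation="varimax").fit(X).components_.T
        # Best match of estimated to population factors over column orderings
        # (congruence is sign-invariant thanks to the abs above).
        best = max(min(congruence(est[:, p[k]], L[:, k]) for k in range(2))
                   for p in itertools.permutations(range(2)))
        hits += best >= threshold
    return hits / n_reps

for n in (100, 200, 400):
    print(n, recovery_rate(n))
```

Reading off the smallest n whose recovery rate exceeds, say, .80 then plays the role of a conventional power analysis.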
In EFA, in very practical terms, never use a sample below 200 participants; the general rule of 10 people per item is, I think, quite good. So, with 20 variables, a sample of 200 could be enough. But my advice is to work with sample sizes somewhat larger than the recommended minimum.
I cannot give you the sources, but to the best of my knowledge the sample size should be AT LEAST 5 subjects per variable. At this minimum level FA should already work (it worked in my case). Still, it is useful to analyze as large a group as possible, as mentioned above.
The first thing we should acknowledge is that EFA is not an inferential analysis (so the notion of power is irrelevant). CFA, on the other hand, tests hypotheses regarding both the overall model fit and individual coefficients.
I am also assuming that Netra is asking about EFA for the standard purpose: coming up with a likely measurement model that might subsequently be tested using a CFA run on (hopefully) an independent dataset.
With these things in mind, we have to note that situations (number of items, number of factors, strengths of associations between items and factors [loadings], inter-factor associations) vary so much from study to study, and we are so unlikely to have a good idea about at least some of these things before we start, that formal rules are likely to be useless. For this reason we are restricted to rules of thumb for multivariate analyses (such as this), or indeed any multivariable model.
Wherever possible, I try to use a 10-20 participants per item rule of thumb, but I would like to mention two qualifications. If I have very FEW items (e.g., 5-10) I would probably increase this to 20-25 per item, whereas if I have many items (e.g., 40 or so), practical considerations will drive me to the lower end (e.g., 10 participants per item).
Finally, if my workup involves both an EFA and a CFA, I would probably just want "enough" for my EFA and spend more sampling effort on the CFA. The fact is we just want the EFA to paint a reasonably good picture of reality (it is the job of the CFA to test the fit of our measurement model). In this respect, I would use 25-33% of my sampling effort for the EFA and 66-75% for the CFA. Note that I am not talking about a sample split here: in an ideal world we collect data for our EFA and CFA on two separate occasions (this provides stronger evidence for external validity).
Personally, I do not know of an authoritative "optimum" sample size. Probability sample sizes depend on the population size and heterogeneity, the margin of error, and the confidence level. Usually, researchers blend these factors to accommodate available funds, ethical concerns, usefulness, scholarliness, and other considerations. The accomplishment of the researcher's objectives, on the basis of this blend of factors, may result in an "optimum" sample size for the particular situation.
I got two methods from my research teachers: 1. multiply the number of variables or statements in the questionnaire (20) by ten (10) = 200; 2. a minimum sample size of 100 for 8 variables or statements.
Otherwise, SPSS checks sampling adequacy through the KMO statistic; if it is greater than .5, the sample for EFA is considered adequate.
@ Aleksander, Shmuel & Cameron: All of you state variations of the persons-to-variables rule of thumb. Can you provide any (either theoretical or empirical) justification for this? Namely, the papers I mentioned before indicate that this kind of rule is not sound (neither theoretically nor empirically).
@Cameron: I do not agree that EFA is not an inferential procedure. For instance, the seminal book by Lawley & Maxwell (1963) is just about putting EFA into the framework of inferential statistics. Admittedly, this stance has been heavily criticised by some (e.g. H. Kaiser), but it remains a legitimate (and far from isolated) position. At least we should say that EFA can be treated as an inferential procedure, although this aspect is not necessarily in the focus. Otherwise, why would we bother with computing, for instance, standard errors of EFA loadings (a topic which has been worked upon quite a lot)?
@Rahul: KMO does not check whether the sample size is OK. It attempts to check whether the correlation structure can be meaningfully described by a set of common factors.
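To make this last point concrete, here is a minimal Python sketch of the KMO statistic computed from its definition (observed correlations versus anti-image partial correlations); the placeholder data matrix is my own assumption:

```python
import numpy as np

def kmo(X):
    """Kaiser-Meyer-Olkin measure: sizes of the observed correlations
    relative to the sizes of the anti-image partial correlations."""
    R = np.corrcoef(X, rowvar=False)
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
    P = -R_inv / d                         # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)  # off-diagonal mask
    r2, p2 = (R[off] ** 2).sum(), (P[off] ** 2).sum()
    return r2 / (r2 + p2)

X = np.random.default_rng(1).normal(size=(300, 20))  # placeholder data
print(kmo(X))  # near .5 for pure noise: no common-factor structure
```

Note that the result reflects the correlation structure, not the sample size: a huge sample of mutually uncorrelated items still yields a poor KMO.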
Point 1: Is (or should) EFA be considered inferential?
I am sure that some people may use EFA in an inferential capacity; whether it should be used that way is another question. Just because some take this stance does not necessarily legitimize it. I also think that some people computing the standard error of a statistic shouldn't, by itself, be a reason to call something "inferential". The standard error can also be used as a quality measure for our statistics (how much we would expect a statistic to vary from sample to sample); only taking the extra step of making statements about the population values (either in terms of hypothesis testing or confidence intervals) makes it inferential. Just to illustrate this point: many people will present summary statistics in their publication along with their standard errors. Is this an inferential analysis? No. It is just a summary statistic with a standard error next to it. They generally do this to describe the sample (descriptive), without statements about the target population (i.e., no inferences).
Your point also raises another question: WHY? Why use EFA in an inferential capacity when CFA provides a much stronger framework for modeling? The Lawley & Maxwell text you cite was written 51 years ago, at a time when there was no CFA. At that stage we only had EFA. We have moved on a little since then (actually since 45 years ago, when CFA was developed).
Finally, I should make one qualification. I am talking about the use of EFA and CFA in the process of building measurement models. Your point may apply to analyses outside of this area.
Point 2: Theoretical and/or empirical justification for rules of thumb
I did spend some time on this point in my last post, but perhaps I need to elaborate. In reality, there is so much variation in the ingredients of an EFA that it is very difficult to come up with a theoretical basis for a sample size calculation in this area. The number of items, the number of factors, the magnitude of item-factor and inter-factor associations, and the factor-analytic model are all likely to drive the sample size needed. The fact is that people usually know so little about these things beforehand that forcing unrealistic guesstimates into any formal sample size calculation is likely to give just another wild guesstimate of the sample size. Also, at present we have no idea what that sample size formula would look like anyway. To some extent the same can be said for empirical evidence: study situations vary so much that a magic number (or formula) probably isn't going to work. Any simulation studies are likely to cover only a very small subset of the "realities" we are going to come across, so they may offer some insight, but it will be limited.
So we come to rules of thumb. BTW Gregor, the whole point of a rule of thumb is that it won't have a theoretical justification; hence the name "rule of thumb". To some extent, rules of thumb are empirical (in a very informal way): they come from researchers who have done a lot of these analyses and have GENERALLY found that these sorts of rules work most of the time. But I also understand your reason for asking. The fact is that reviewers (especially in some disciplines) are often so obsessed with magic numbers (and formulae) that they are almost always going to ask for references to back up your decisions. Requiring justification is not always a bad thing, but reviewers can often be a little too dogmatic, which shows that these reviewers themselves have a limited understanding of the methods.
Life would be much easier if every study were an RCT (so we had theoretical support for our sample size calculations), but this is not the case. Forcing multivariable and multivariate observational studies into an experimental framework will just give us silly and unrealistic numbers. Convincing reviewers of this? That is your job, but if they ask you for a formal sample size calculation for an EFA, you are almost halfway there (they don't appreciate the complexities underlying studies of this type).
Actually, I rarely find much benefit to doing EFA first. Use theory to make hypotheses and then test them with CFA. When I go to EFA, it's usually because my hypothesis-driven CFA is clearly off the mark.
As the names suggest, I would say it is closer to the mark that EFA is for exploring the structure of your factors, and CFA is for confirming that the structure works (that is, EFA draws the picture, and CFA checks whether it fits reality). Given these two different roles, it is important that the EFA and CFA be run on different datasets (otherwise overfitting is likely). Ideally these two samples should be collected on two different occasions, but if that isn't possible, randomly splitting your observations into two different datasets might be the next best option.
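A minimal sketch of such a random split in Python (the 50/50 proportion is my own arbitrary assumption; Cameron's 25-33% / 66-75% allocation above would argue for an unequal split):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))               # placeholder: one pooled data matrix

idx = rng.permutation(len(X))                # shuffle the observation indices
half = len(X) // 2
X_efa, X_cfa = X[idx[:half]], X[idx[half:]]  # EFA on one half, CFA on the other
```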
Again, I should qualify that I am talking about the standard approach to the workup of a psychometric (or, in my case, mostly clinimetric) instrument. BUT I have to agree with Pat here. If you have done your homework and are going through the whole process (starting by designing and content-validating your own items), then the EFA should (but will not always) tell you what you already know: that the items cluster in much the same way you designed them to cluster. So going straight to the CFA is probably the better option. You would only come back to EFA if respecifying (tweaking) your model in the CFA fails.
1. The inferential aspect
In my opinion, the inferential aspect should always be important when running EFA (I wouldn't like to enter the discussion about the usefulness of EFA vs. CFA in general, because it's quite off-topic here). If you run EFA, you should always be interested at least in confidence intervals for the factor loadings, to get an impression of the accuracy of your solution. This is a purely inferential thing, and the information obtained is qualitatively different from the information obtained by CFA, since it relates to a different kind of model. You may not very often use/need an unrestricted model, but that is a different question. Of course, the L&M book can be considered old now (I cited it as a pioneering work), but software packages like Mplus or CEFA (both providing inferential information for EFA) are quite state-of-the-art products.
2. Rules of thumb
Thank you for the clarification of the exact meaning of the term "rule of thumb", but my point nevertheless remains the same: if you use some kind of simplified rule or set of rules, it's better to use one that has some empirical and theoretical background. You're right to criticize the limited ecological validity of simulation studies, but on the other hand their results are the best empirical evidence we have (and certainly more valid than a researcher's intuitive impression of what might be important). After all, practically all empirical (and especially experimental) sciences build a complex picture from pieces of evidence collected in oversimplified situations.
Gregor Socan - you wrote: "All of you state variations of the persons-to-variables rule of thumb. Can you provide any (either theoretical or empirical) justification for this?" Have you read this paper?
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1-9.
@Ulf: If I didn't overlook something, Costello & Osborne kept their number of variables fixed and varied only the number of subjects. What they interpret as the effect of the subject-to-item ratio may therefore simply be the effect of the sample size.
That depends on the sample size. The minimum optimal sample size is 50. If you have only 30 observations you cannot use FA, but only a cluster analysis of cases.
@Maria: Factor analysis on 50 observations will usually result in very unstable estimates (at least with psychological/social science data; if you analyze very accurate chemical measurements, N = 50 may be quite OK). In my opinion, one should examine the standard errors very carefully whenever the sample size is smaller than, say, 200.
It seems to be an important topic, given that such a short question has resulted in so many responses.
The short answer is that any estimate will become more precise with every participant you add, so there is no upper limit.
Depending on the level at which you want to interpret your results, the minimal sample size can also be quite high. In a recent paper we found that claims about individual items from a 25-item Big Five questionnaire require more than 10,000 participants.
You should probably try to get as many as is feasible for you and use bootstrapping to test the robustness of your results in your data.
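A bootstrap check of loading stability might look like the following sketch in Python; the one-factor model, the 500 resamples, and the use of scikit-learn's FactorAnalysis are my own illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(250, 20))                    # placeholder: substitute your data

n_boot = 500
loadings = np.empty((n_boot, X.shape[1]))
for b in range(n_boot):
    Xb = X[rng.integers(0, len(X), size=len(X))]  # resample rows with replacement
    lam = FactorAnalysis(n_components=1).fit(Xb).components_[0]
    loadings[b] = lam * np.sign(lam.sum())        # align the arbitrary sign across resamples

# Wide percentile intervals flag loadings that are unstable at this sample size.
lo, hi = np.percentile(loadings, [2.5, 97.5], axis=0)
print(np.column_stack([lo, hi]).round(2))
```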
Ferguson and Cox (1993) suggest that 100 respondents is the absolute minimum number needed to undertake exploratory factor analysis. However, others would suggest that this is insufficient, and a rule of thumb would be five respondents per item (Bryman & Cramer, 1997). This type of analysis must follow a predefined and systematic analytic sequence (Ferguson & Cox, 1993).
I'm looking into this for SEM. I'm new to it, but I think I've got it right: there has been a lot of work on this topic since 2000, resulting in nuances that were not previously included in the discussion.
Hayes (2013) identified two power issues in SEM modeling: 1) identifying whether a parameter is significantly different from another estimated parameter, and 2) whether the model is a reasonable or a ridiculous model. Calculating power for an SEM model to identify parameter significance can be challenging (Hayduk et al., 2007). Historically, the general guideline for SEM has been to have a sample size greater than 200 cases in the dataset (Barrett, 2007). However, as pointed out by Little (2013), several factors can influence the quality of the data, and hence the accuracy of the reported means, standard deviations, and correlations (the sufficient statistics). The NCS and CPES studies use random sampling techniques and provide a wide variety of items for most construct parameters, both of which support a sample size of 120 participants, based on Hayes' criteria for identifying sample size:
(1) the heterogeneity and representativeness of the sample, (2) the precision of the measures in terms of reliability and scaling (e.g., ordinal scales are less precise than interval scales), (3) the convergent and discriminant validity of the indicators and constructs, and (4) model complexity (i.e., complex models typically have highly correlated parameter estimates, which makes estimating them harder with small sample sizes). (p. 121)
Studies that use random sampling techniques do not need to be as large as a convenience sample. Strategies for identifying the items used in the constructs' parameters can further support smaller sample sizes. The precision of the indicators is stronger when variables use items that are all positively correlated at greater than .6 with one another; this level of correlation allows SEM software to easily identify the correct optimal solution for each construct's parameters. Limiting the variability of the item correlations (using tau-equivalent items) is preferable to mixing strongly and weakly correlated items (congeneric items).
Little references MacCallum, Widaman, Zhang, and Hong's (1999) finding that general sample size guidelines in exploratory factor analysis have no basis, in support of his own finding that sample sizes greater than 120 are adequate for any modeling when each construct contains three tau-equivalent items. Such sample sizes provide measurements for loadings, intercepts, and residuals that are locally just-identified for any model size. Modest sample sizes are supported by good construct parameter variable development with cross-sectional data. It is possible that not all factors can be built from three tau-equivalent items. If some factors show a lack of metric invariance, the estimated factor loadings can be evaluated by looking at the confidence intervals to clarify the nature of the misfit and to decide whether to remove items or allow partial invariance (Meade & Bauer, 2007). For between-group analyses, Little (personal communication, 2013) suggested a minimum group size of 50-75 participants per group.
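The "all pairwise correlations above .6" screening described above is easy to automate before fitting anything; in this minimal Python sketch the construct-to-item assignments and the placeholder data are hypothetical:

```python
import numpy as np

def min_pairwise_r(X, items):
    """Smallest pairwise correlation among the items assigned to one construct."""
    R = np.corrcoef(X[:, items], rowvar=False)
    return R[~np.eye(len(items), dtype=bool)].min()

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 9))                                  # placeholder data
constructs = {"A": [0, 1, 2], "B": [3, 4, 5], "C": [6, 7, 8]}  # hypothetical groupings
for name, items in constructs.items():
    r = min_pairwise_r(X, items)
    print(name, round(r, 2), "OK" if r > 0.6 else "below the .6 guideline")
```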
Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. New York, NY: Guilford Press.
Hayduk, L., Cummings, G., Boadu, K., Pazderka-Robinson, H., & Boulianne, S. (2007). Testing! Testing! One, two, three - testing the theory in structural equation models! Personality and Individual Differences, 42(5), 841-850. doi:10.1016/j.paid.2006.10.001
Barrett, P. (2007). Structural equation modelling: Adjudging model fit. Personality and Individual Differences, 42(5), 815-824. doi:10.1016/j.paid.2006.09.018
Little, T. D. (2013). Longitudinal structural equation modeling. New York, N.Y.: Guilford Press.
MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4(1), 84-99.
Meade, A. W., & Bauer, D. J. (2007). Power and precision in confirmatory factor analytic tests of measurement invariance. Structural Equation Modeling, 14(4), 611-635.
Thank you for your response. It is good input, but loadings etc. will be known only after the analysis, whereas one needs the sample size before data collection. Could we think of writing a high-quality monograph on this topic?
Factor analysis (and also PCA) is a data-analytic tool/method. As pointed out by Cameron Paul Hurst, the use of EFA as an inferential method is questionable. Treating the data as a "sample" implies that you intend to apply the conclusions to the population from which the sample was drawn. However, the derivation of the equations for EFA is usually not based on any statistical assumptions (although sometimes multivariate normality is assumed). EFA is best used to ask questions about the dataset itself, in which case the question of sample size is not relevant. Finally, EFA (and PCA) are multivariate methods, and sampling design is more complicated in that case.