Factor analysis requires interval data. For nominal and ordinal data you need ordinal factor analysis; this is one of the reasons Rcmdr (in R) does not perform a factor analysis when the data are nominal or ordinal. Ordinal factor analysis is comparable to IRT modelling in that it also estimates thresholds between response categories.
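As a rough sketch of what ordinal factor analysis can look like in R (using the psych package; the data frame items and the choice of two factors are hypothetical), polychoric correlations estimate thresholds between response categories in much the same spirit as IRT:

# Sketch: exploratory factor analysis on ordinal (Likert-type) items,
# assuming a data frame `items` whose columns are integer-coded responses.
library(psych)

# Polychoric correlations treat the observed categories as cut-points
# (thresholds) on underlying continuous variables, much as IRT does.
pc <- polychoric(items)
pc$rho   # polychoric correlation matrix
pc$tau   # estimated thresholds between response categories

# fa() can use polychoric correlations directly via cor = "poly"
fit <- fa(items, nfactors = 2, fm = "minres", cor = "poly")
print(fit$loadings, cutoff = 0.3)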
For nominal and ordinal data, the Pythagorean theorem cannot be used to derive bounds on the correlations.
The data should be multivariate normally distributed. See Boomsma (1983), On the robustness of LISREL (maximum likelihood estimation) against small sample size and non-normality (open access, University of Groningen).
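One way to examine this assumption, as a sketch using the psych package on a hypothetical data frame items, is Mardia's test of multivariate skewness and kurtosis:

# Sketch: checking multivariate normality before maximum likelihood
# factor analysis, assuming a numeric data frame `items`.
library(psych)

# Mardia's multivariate skewness and kurtosis; substantial departures
# suggest ML chi-square tests and standard errors may be distorted.
mardia(items, plot = FALSE)

# Univariate skew and kurtosis per item as a quick first pass
describe(items)[, c("skew", "kurtosis")]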
For exploratory factor analysis, the Kaiser-Meyer-Olkin index should preferably be > .80 but could be as low as .60, and Bartlett’s test of sphericity should have p < .05 but preferably have p < .001.
In addition, inter-item correlations should ideally lie between .15 and .50; they could perhaps be somewhat higher than that, but certainly not higher than .90, or you could have problems with collinearity.
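These checks (KMO, Bartlett's test, and the inter-item correlations) are straightforward to run in R; the following is a sketch using the psych package on a hypothetical data frame items:

# Sketch: sampling-adequacy and inter-item correlation checks for EFA,
# assuming a numeric data frame `items`.
library(psych)

R <- cor(items, use = "pairwise.complete.obs")

KMO(R)                                # overall MSA ideally > .80, at least > .60
cortest.bartlett(R, n = nrow(items))  # want p < .05, preferably p < .001

# Inter-item correlations: ideally roughly .15 to .50, none above ~.90
r_off <- R[lower.tri(R)]
summary(r_off)
which(abs(R) > .90 & lower.tri(R), arr.ind = TRUE)  # possible collinearity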
Furthermore, there need to be enough observations. There are discrepant notions in the literature concerning this. In a recent publication, I wrote the following:
Although Pallant (2016) has proposed that a sample size only five times the number of items might be sufficiently large for some factor analyses, Comrey and Lee (1992) described samples with fewer than 50 people as very poor, samples smaller than 100 as poor, and samples with 200 participants as only fair. Dixon (2005) recommended that the sample size for factor analysis should be at least 10 times the number of items involved in the analysis, Hair et al. (2014) recommended similarly that the cases-per-variable ratio should be as high as possible, and Tabachnick and Fidell (2007) recommended that there be at least 300 people in the sample. Other, more complex, methods for determining a satisfactory sample size exist. These methods are based on size of communalities, number of items in the factors, and size of loadings (see Gaskin & Happell, 2014; Hogarty, Hines, Kromrey, Ferron, & Mumford, 2005; Mundfrom, Shaw, & Ke, 2005) . . .
If you would like to check out the article in which the above passage appears, perhaps to get the references, here are the full details:
Ma, K., Trevethan, R., & Lu, S. (2019). Measuring teacher sense of efficacy: Insights and recommendations concerning scale design and data analysis from research with preservice and inservice teachers in China. Frontiers of Education in China, 14(4), 612–686. https://doi.org/10.1007/s11516-019-0029-1
We are currently converting this article to open access, and I anticipate putting an author copy on RG soon.
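As a rough illustration, the simpler sample-size rules of thumb quoted above can be checked with a few lines of R (the number of items and the sample size below are purely hypothetical):

# Sketch: comparing a planned sample against the rules of thumb quoted above.
n_items <- 24
n       <- 250

c(
  pallant_5_per_item       = n >= 5 * n_items,   # Pallant (2016)
  dixon_10_per_item        = n >= 10 * n_items,  # Dixon (2005)
  comrey_lee_at_least_fair = n >= 200,           # Comrey & Lee (1992): 200 = "fair"
  tabachnick_min_300       = n >= 300            # Tabachnick & Fidell (2007)
)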
It is imperative to do data screening. The process involves checking for and dealing with missing data, which can be done by examining frequencies or using the COUNTBLANK function in Excel. You also need to distinguish engaged from unengaged respondents in order to identify outliers. Then check for normality, linearity, homoscedasticity, and multicollinearity.
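If you prefer to do the screening in R rather than Excel, here is a minimal sketch (assuming a hypothetical numeric data frame items of questionnaire responses; it covers only missing data, straight-lining, and a first pass at normality and multicollinearity):

# Sketch: basic data screening before factor analysis.
library(psych)

# Missing data per item (the R equivalent of COUNTBLANK per column)
colSums(is.na(items))

# Unengaged respondents: zero (or near-zero) variance across a row of
# items usually signals straight-lining
row_sd <- apply(items, 1, sd, na.rm = TRUE)
which(row_sd == 0)

# Quick univariate normality screen (skew and kurtosis per item)
describe(items)[, c("skew", "kurtosis")]

# Multicollinearity screen: any inter-item correlation close to 1
R <- cor(items, use = "pairwise.complete.obs")
max(abs(R[lower.tri(R)]))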
1. Factor analysis means a few different things, so you should specify what you mean; the different varieties may have different assumptions.
2. Assumptions for what? A typical factor analysis produces a lot of output, with several statistics, and these may differ in how robust they are to various aspects of the data.
3. The first assumption is that you believe there are underlying latent variables causing variation in the manifest variables. Previous commentators have focused on details that could be considered only after this one.
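To illustrate that assumption, here is a small simulation sketch in R (all values hypothetical): a single latent variable generates the shared variation in four manifest variables, and an EFA on the manifest variables should recover roughly one factor.

# Sketch of the common factor idea: one unobserved variable drives
# the shared variation in several observed items.
set.seed(1)
n        <- 500
latent   <- rnorm(n)            # unobserved factor
loadings <- c(.8, .7, .6, .5)   # how strongly each item reflects it

manifest <- sapply(loadings, function(l) {
  l * latent + rnorm(n, sd = sqrt(1 - l^2))  # item = loading * factor + unique error
})
colnames(manifest) <- paste0("item", 1:4)

# An EFA on the manifest variables should recover approximately these loadings
psych::fa(manifest, nfactors = 1)$loadings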