Factor analysis requires interval data. For nominal and ordinal data you need ordinal factor analysis; this is one of the reasons Rcmdr (in R) does not perform a factor analysis when the data are nominal or ordinal. Ordinal factor analysis is comparable to IRT modelling in that it also estimates thresholds between response categories.
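As a rough sketch of what ordinal factor analysis can look like in R (using the psych package; the data frame items and the choice of two factors are hypothetical), polychoric correlations estimate thresholds between response categories in much the same spirit as IRT:

# Sketch: exploratory factor analysis on ordinal (Likert-type) items,
# assuming a data frame `items` whose columns are integer-coded responses.
library(psych)

# Polychoric correlations treat the observed categories as cut-points
# (thresholds) on underlying continuous variables, much as IRT does.
pc <- polychoric(items)
pc$rho   # polychoric correlation matrix
pc$tau   # estimated thresholds between response categories

# fa() can use polychoric correlations directly via cor = "poly"
fit <- fa(items, nfactors = 2, fm = "minres", cor = "poly")
print(fit$loadings, cutoff = 0.3)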
For nominal and ordinal data, the Pythagorean theorem cannot be used to derive bounds on the correlations.
The data should be multivariate normally distributed. See Boomsma (1983), On the robustness of LISREL (maximum likelihood estimation) against small sample size and non-normality (open access, University of Groningen).
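One way to examine this assumption, as a sketch using the psych package on a hypothetical data frame items, is Mardia's test of multivariate skewness and kurtosis:

# Sketch: checking multivariate normality before maximum likelihood
# factor analysis, assuming a numeric data frame `items`.
library(psych)

# Mardia's multivariate skewness and kurtosis; substantial departures
# suggest ML chi-square tests and standard errors may be distorted.
mardia(items, plot = FALSE)

# Univariate skew and kurtosis per item as a quick first pass
describe(items)[, c("skew", "kurtosis")]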
For exploratory factor analysis, the Kaiser-Meyer-Olkin index should preferably be > .80 but could be as low as .60, and Bartlett’s test of sphericity should have p < .05 but preferably have p < .001.
In addition, inter-item correlations should ideally lie between .15 and .50; they could perhaps be somewhat higher than that, but certainly not higher than .90, or you could have problems with collinearity.
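These checks (KMO, Bartlett's test, and the inter-item correlations) are straightforward to run in R; the following is a sketch using the psych package on a hypothetical data frame items:

# Sketch: sampling-adequacy and inter-item correlation checks for EFA,
# assuming a numeric data frame `items`.
library(psych)

R <- cor(items, use = "pairwise.complete.obs")

KMO(R)                                # overall MSA ideally > .80, at least > .60
cortest.bartlett(R, n = nrow(items))  # want p < .05, preferably p < .001

# Inter-item correlations: ideally roughly .15 to .50, none above ~.90
r_off <- R[lower.tri(R)]
summary(r_off)
which(abs(R) > .90 & lower.tri(R), arr.ind = TRUE)  # possible collinearity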
Furthermore, there need to be enough observations. There are discrepant notions in the literature concerning this. In a recent publication, I wrote the following:
Although Pallant (2016) has proposed that a sample size only five times the number of items might be sufficiently large for some factor analyses, Comrey and Lee (1992) described samples with fewer than 50 people as very poor, samples smaller than 100 as poor, and samples with 200 participants as only fair. Dixon (2005) recommended that the sample size for factor analysis should be at least 10 times the number of items involved in the analysis, Hair et al. (2014) recommended similarly that the cases-per-variable ratio should be as high as possible, and Tabachnick and Fidell (2007) recommended that there be at least 300 people in the sample. Other, more complex, methods for determining a satisfactory sample size exist. These methods are based on size of communalities, number of items in the factors, and size of loadings (see Gaskin & Happell, 2014; Hogarty, Hines, Kromrey, Ferron, & Mumford, 2005; Mundfrom, Shaw, & Ke, 2005) . . .
If you would like to check out the article in which the above passage appears, perhaps to get the references, here are the full details:
Ma, K., Trevethan, R., & Lu, S. (2019). Measuring teacher sense of efficacy: Insights and recommendations concerning scale design and data analysis from research with preservice and inservice teachers in China. Frontiers of Education in China, 14(4), 612–686. https://doi.org/10.1007/s11516-019-0029-1
We are currently converting this article to open access, and I anticipate putting an author copy on RG soon.
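As a rough illustration, the simpler sample-size rules of thumb quoted above can be checked with a few lines of R (the number of items and the sample size below are purely hypothetical):

# Sketch: comparing a planned sample against the rules of thumb quoted above.
n_items <- 24
n       <- 250

c(
  pallant_5_per_item       = n >= 5 * n_items,   # Pallant (2016)
  dixon_10_per_item        = n >= 10 * n_items,  # Dixon (2005)
  comrey_lee_at_least_fair = n >= 200,           # Comrey & Lee (1992): 200 = "fair"
  tabachnick_min_300       = n >= 300            # Tabachnick & Fidell (2007)
)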
It is imperative to do data screening. The process involves checking for and dealing with missing data, which can be done by examining frequencies or using the COUNTBLANK function in Excel. You also need to distinguish engaged from unengaged respondents in order to identify outliers. Then check for normality, linearity, homoscedasticity, and multicollinearity.
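If you prefer to do the screening in R rather than Excel, here is a minimal sketch (assuming a hypothetical numeric data frame items of questionnaire responses; it covers only missing data, straight-lining, and a first pass at normality and multicollinearity):

# Sketch: basic data screening before factor analysis.
library(psych)

# Missing data per item (the R equivalent of COUNTBLANK per column)
colSums(is.na(items))

# Unengaged respondents: zero (or near-zero) variance across a row of
# items usually signals straight-lining
row_sd <- apply(items, 1, sd, na.rm = TRUE)
which(row_sd == 0)

# Quick univariate normality screen (skew and kurtosis per item)
describe(items)[, c("skew", "kurtosis")]

# Multicollinearity screen: any inter-item correlation close to 1
R <- cor(items, use = "pairwise.complete.obs")
max(abs(R[lower.tri(R)]))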
1. Factor analysis means a few different things, so you should specify what you mean; the different varieties may have different assumptions.
2. Assumptions for what? A typical factor analysis produces a lot of output, with several statistics, and these may differ in how robust they are to various aspects of the data.
3. The first assumption is that you believe there are underlying latent variables causing variation in the manifest variables. Previous commentators have focused on details that could be considered only after this one.
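To illustrate that assumption, here is a small simulation sketch in R (all values hypothetical): a single latent variable generates the shared variation in four manifest variables, and an EFA on the manifest variables should recover roughly one factor.

# Sketch of the common factor idea: one unobserved variable drives
# the shared variation in several observed items.
set.seed(1)
n        <- 500
latent   <- rnorm(n)            # unobserved factor
loadings <- c(.8, .7, .6, .5)   # how strongly each item reflects it

manifest <- sapply(loadings, function(l) {
  l * latent + rnorm(n, sd = sqrt(1 - l^2))  # item = loading * factor + unique error
})
colnames(manifest) <- paste0("item", 1:4)

# An EFA on the manifest variables should recover approximately these loadings
psych::fa(manifest, nfactors = 1)$loadings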