If you use SPSS, you need only the KMO test (for sampling adequacy) and Bartlett's test of sphericity (to ensure moderate inter-correlations). The normality test can be overlooked.
"In applying mathematics to subjects such as physics or statistics we make tentative assumptions about the real world which we know are false but which we believe may be useful nonetheless. The physicist knows that particles have mass and yet certain results, approximating what really happens, may be derived from the assumption that they do not. Equally, the statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world."
In that light, your items are not normally distributed. So the question becomes whether the normal distribution provides a model that is good enough to use. (In another of his famous quotes, Box said that all models are wrong, but some are useful, or something along those lines.)
What types of items do you have? If they are Likert-type items (which are pretty common in factor analyses), you would probably be better off, nowadays, using a method designed for ordinal items. I'm not an expert on factor analysis, so I'll let others advise you about what those methods are.
Hello, you could do an EFA or CFA using polychoric correlations (assuming that your latent variable is normally distributed); a short R sketch is given at the end of this answer.
For EFA, a good free software is Factor 9.0 (http://psico.fcep.urv.es/utilitats/factor/).
For CFA, a good paper is the one by Flora and Curran (https://www.statmodel.com/download/floracurran.pdf). The classic method (WLS) uses a polychoric correlation matrix as input and an asymptotic variance-covariance matrix as the weight matrix. The problem is that this requires a large sample size (larger as the model has more free parameters to estimate). The robust WLS that Flora and Curran refer to is, as far as I know, only available in Mplus.
LISREL has a DWLS method for ordinal items with small sample sizes, but I am not aware of research on the performance of this method. And in AMOS you can work with ordinal items using Bayesian estimation (I do not know much about this type of estimation).
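If you do go the R route, a minimal sketch of an EFA on polychoric correlations, assuming the psych package and placeholder data/item names (a data frame mydata with items x1-x10), could look like this:

# EFA on a polychoric correlation matrix (data and item names are placeholders)
library(psych)
items <- mydata[, paste0("x", 1:10)]             # ordinal Likert-type items
pc <- polychoric(items)                          # polychoric correlations
efa <- fa(r = pc$rho, nfactors = 2, n.obs = nrow(items),
          rotate = "oblimin", fm = "minres")     # e.g. 2 factors, oblique rotation
print(efa$loadings, cutoff = 0.30)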
Why not use R (http://www.r-project.org/)? It is free and (with a little time investment) easy to use. The lavaan package in R can do the CFA with robust estimators (http://lavaan.ugent.be/tutorial/est.html).
The code should look something like this (adapt it to your model and items):
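A minimal lavaan sketch along those lines, with placeholder factor and item names (two factors measured by six ordinal items) that you would replace with your own:

library(lavaan)

# placeholder model: two factors measured by three ordinal items each
model <- '
  factor1 =~ x1 + x2 + x3
  factor2 =~ x4 + x5 + x6
'

fit <- cfa(model, data = mydata,
           ordered = c("x1", "x2", "x3", "x4", "x5", "x6"),
           estimator = "WLSMV")   # robust weighted least squares for ordinal items
summary(fit, fit.measures = TRUE, standardized = TRUE)

Declaring the items as ordered makes lavaan work with polychoric correlations; WLSMV is the robust estimator described in the tutorial linked above.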
As you noted, "none of them are normally distributed". Whatever analysis you are going to use, this situation warrants careful inspection of the items. The first thing to ask yourself is whether these items were meant to be normally distributed. Did you (or the person creating the items) have normal distributions in mind? Why are they not normally distributed?
One common reason for data not being normally distributed is that the data are actually proportions, in which case a simple log transformation may suffice. Another, far more difficult, situation arises when some of the answers have a double meaning (the middle position on the scale may mean a 'true' middle as well as 'I do not know' or 'not relevant'). A further common cause of non-normality is that the data are not properly centered: the item in question solicits extreme answers and most responses fall into one or two scores.
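For the proportion case, a tiny R sketch of what such a transformation looks like (variable names are placeholders):

p <- mydata$prop_item                   # an item that is really a proportion in (0, 1)
mydata$prop_log   <- log(p)             # simple log transform
mydata$prop_logit <- log(p / (1 - p))   # logit transform, often preferable for proportions
hist(mydata$prop_logit)                 # inspect the transformed distribution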
A good approach to such data is to look at the scales in detail by running a procedure like PRINCALS or HOMALS, in SPSS or in R (see the sketch at the end of this answer). While doing this you can select different measurement levels for your items, treating them first as nominal and then as ordinal. This will give you information about the possibilities of transforming the data before running further analyses, and it will also show you whether the assumption of ordinal measurement implied in some of the previous answers is appropriate.
Finally, problems with distributions may also arise from a lack of a homogeneous population of respondents: if a questionnaire is administered to two extreme populations, one cannot expect normality, as the result will be the sum of two different distributions. One way of analyzing this is to compare the distributions of different subgroups, or to include some of the background variables in your PRINCALS runs.
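A minimal R sketch of both suggestions, assuming the Gifi package (which provides princals and homals) and placeholder item and grouping names:

library(Gifi)
items <- mydata[, paste0("x", 1:10)]

fit_ord <- princals(items)                    # items treated as ordinal (the default)
fit_nom <- princals(items, ordinal = FALSE)   # items treated as nominal
plot(fit_ord, plot.type = "transplot")        # inspect the optimal category transformations

# quick check of homogeneity: item distributions per subgroup
by(items$x1, mydata$group, table)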
All the comments helped me find out something about the data. We adapted an instrument and examined it in pregnant mothers. It is about attitudes towards breastfeeding, social support, and self-efficacy. The scale was first developed to identify mothers at risk of early weaning. In our research population, women's positive attitude toward breastfeeding is high and they are well supported. So they chose higher points when rating on a 6-point Likert scale from strongly agree to strongly disagree. They really do agree.
If I understand you correctly, the answers tend to be mostly on the higher side of the scales. In such a situation the scores become rather dependent on aspects other than the intended scales. For example, persons who dislike choosing extremes on scales will hesitate to select such extremes, and because of this the scores become more dependent on other sources of variation than on the intended construct.
Attitudes towards breastfeeding comprise two factors: women's positive and negative judgments of breastfeeding. In the part that assessed women's positive judgment of breastfeeding, women mostly rated 4.
As a prelude to employing FA, the sampling adequacy and the factorability of the data must be examined (using SPSS). In ensuring the factorability of the data, Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy have to be examined.
According to Tabachnick and Fidell (2001), Bartlett's test of sphericity should be significant (p < .05) and the KMO value should be .6 or above for factor analysis to be considered appropriate.
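For what it is worth, the same two checks can also be run outside SPSS; a short sketch with the psych package in R (data names are placeholders):

library(psych)
R <- cor(items)                        # or a polychoric matrix for ordinal items
KMO(R)                                 # overall MSA, commonly required to be above .6
cortest.bartlett(R, n = nrow(items))   # Bartlett's test, should be significant (p < .05)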
I am sorry, but I do not understand your description of the scale. As it now sounds, you are referring to two sets of items, one set worded positively and one set worded negatively. In general terms this would normally be considered a single factor, not two. If you recode your items by changing the sign, they would all end up on the same side. Psychologically, however, it may be more difficult for some respondents to choose extreme positive answers than to choose extreme negative answers. These types of differences are often cultural and have little to do with the construct one tries to measure. In research on breastfeeding in Africa in which I have been involved, I also encountered an unexpected negative association between breastfeeding duration and wealth: poor mothers breastfed their children longer because they could not feed their babies in other ways, while wealthier women copied the Western pattern of early weaning and bought commercial baby feeds. This is a clear example of how a research population may not be homogeneous.
If variables deviate significantly from normality, then factor analysis by the maximum likelihood method can be adopted, as maximum likelihood estimation is relatively insensitive to deviations from normality. Ref: Fuller & Hemmerle (1966).
Fuller Jr, E. L., and W. J. Hemmerle. "Robustness of the maximum-likelihood estimation procedure in factor analysis." Psychometrika 31.2 (1966): 255-266.
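In R, a minimal sketch of such a maximum-likelihood factor analysis with the base factanal function (number of factors and data name are placeholders):

# maximum-likelihood EFA in base R (stats::factanal)
efa_ml <- factanal(x = items, factors = 2, rotation = "promax")
print(efa_ml, cutoff = 0.30)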
Sanjay, as a research methodologist I would always advise (1) looking for an explanation of the deviations from normality and examining all histograms carefully, and (2) checking whether a transformation of the item scores can reduce the deviations. Very often log transformations can help with non-normal distributions. Using a more robust method may at first sight solve everything, but it often only hides some of the problems with the underlying data. In general I am an advocate of robust methods, but a proper understanding of the nature of the data is always required.
It can be good to transform your data towards normality in order to keep the maximum number of indicators in your factor analysis. When running the KMO test with your non-normal indicators, you may observe that some of the indicators are responsible for pulling the KMO below its benchmark (above 0.5). In that case you have to remove those indicators from the factor analysis in order to keep the KMO above the benchmark. Therefore, my suggestion is to transform your data towards normality first and then proceed with the analysis. Thanks
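A short sketch of that indicator check with the psych package (data name is a placeholder); the item-level MSA values show which indicators pull the overall KMO down:

library(psych)
kmo <- KMO(items)
kmo$MSAi                                    # sampling adequacy per item
low <- names(kmo$MSAi)[kmo$MSAi < 0.5]      # items below the 0.5 benchmark
items_reduced <- items[, setdiff(names(items), low)]
KMO(items_reduced)$MSA                      # recheck the overall KMO after removal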