What are the general suggestions for dealing with cross-loadings in exploratory factor analysis? Do I have to eliminate items that load above 0.3 on more than one factor?
There is some controversy about this. Researchers often use 0.50 as the threshold. However, others argue that what matters is that an item's loading on its main factor is higher than its loadings on the other factors (they do not give a specific threshold). Still others suggest there should be a difference of at least 0.20 between loadings: for example, if an item loads 0.80 on one factor, its highest loading on any other factor should be no more than 0.60.
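As an illustration of that 0.20-gap rule, here is a minimal Python sketch; the loading matrix, item names, and cutoffs below are made up for the example:

import pandas as pd

# Hypothetical rotated loading matrix (items x factors); the numbers are invented.
loadings = pd.DataFrame(
    {"F1": [0.80, 0.72, 0.45, 0.65],
     "F2": [0.60, 0.15, 0.55, 0.10]},
    index=["item1", "item2", "item3", "item4"],
)

def flag_cross_loadings(L, secondary_cutoff=0.30, min_gap=0.20):
    """Flag items whose two highest absolute loadings are both above
    secondary_cutoff and differ by less than min_gap."""
    flags = {}
    for item, row in L.abs().iterrows():
        top2 = row.sort_values(ascending=False).iloc[:2]
        flags[item] = (top2.iloc[1] >= secondary_cutoff
                       and (top2.iloc[0] - top2.iloc[1]) < min_gap)
    return pd.Series(flags, name="possible_cross_loading")

print(flag_cross_loadings(loadings))  # here only item3 (0.55 vs 0.45) gets flagged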
I have seen exactly what you mention about the 0.20 difference in some papers. In my case, I used 0.4 as the suppression criterion, but I still have some cross-loadings (with less than a 0.2 difference). The general purpose of EFA is to retain the items that load highest on one factor, but do I have to eliminate the ones with cross-loadings in order to get independent (uncorrelated) factors?
It is up to you whether you use a criterion of 0.4 or 0.5; either can be used, but you have to give a proper reference to support it. Cross-loadings meeting the criterion can be kept for further analysis.
What if I use the 0.5 criterion and still see some cross-loadings that are significant? How should I deal with them: eliminate them or not? I tried eliminating some items (those that still load on other factors with a difference of less than 0.2) after suppressing, and it seems quite reasonable; model performance has also improved. Afterwards I plan to run OLS and I need independent factors.
Which software are you using? New tendencies in PLS-SEM recommend establishing discriminant validity via a new approach, HTMT, which has been demonstrated to be more reliable than the Fornell-Larcker criterion and cross-loading examination. According to its proponents, cross-loadings should only be checked when HTMT fails, in order to find the problematic items between constructs.
Read this paper: https://link.springer.com/article/10.1007/s11747-014-0403-8
SmartPLS computes the HTMT matrix directly, but I think you should also be able to compute it manually using the formula (which involves the correlations among the constructs' items).
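In case it helps, here is a minimal Python sketch of the HTMT formula from Henseler et al. (2015), computed from an item correlation matrix and an assumed mapping of constructs to their items; it is meant as an illustration of the formula, not as a replacement for SmartPLS output:

import numpy as np
import pandas as pd

def htmt(item_corr: pd.DataFrame, blocks: dict) -> pd.DataFrame:
    """Heterotrait-monotrait ratio for every pair of constructs.
    item_corr: correlation matrix of all items.
    blocks: {construct name: list of its item names}, each with >= 2 items."""
    names = list(blocks)
    out = pd.DataFrame(np.nan, index=names, columns=names)

    def mean_monotrait(items):
        sub = item_corr.loc[items, items].to_numpy()
        iu = np.triu_indices_from(sub, k=1)        # unique within-construct pairs
        return sub[iu].mean()

    for a in names:
        for b in names:
            if a == b:
                continue
            hetero = item_corr.loc[blocks[a], blocks[b]].to_numpy().mean()
            out.loc[a, b] = hetero / np.sqrt(mean_monotrait(blocks[a]) *
                                             mean_monotrait(blocks[b]))
    return out

# Usage (hypothetical names): htmt(items_df.corr(), {"Quality": ["q1", "q2"], "Image": ["i1", "i2"]})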
In general, we eliminate items with cross-loadings (i.e., items with loadings higher than 0.3 on more than one factor). But before eliminating these items, you can try several rotations.
In practice, I would look at the item statement. Cross-loading indicates that the item measures several factors/concepts. Such an item could also be a source of multicollinearity between the factors, which is not a desirable end product of the analysis, since we are looking for distinct factors. My point is: do not rely solely on the factor loading value or a specific cutoff; also take a look at the content of the item. The item statement could be too general.
I know there are three types of orthogonal rotation: Varimax, Quartimax and Equamax. The most widely used is Varimax, but can you briefly tell me what the difference is between the Quartimax and Equamax rotation methods?
Maybe this helps: http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/multivariate/principal-components-and-factor-analysis/methods-for-orthogonal-rotation/
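If you want to see the difference empirically, you can run the same data through all three rotations and compare the loading patterns. A small sketch, assuming the Python factor_analyzer package and a DataFrame of item responses (the file name and factor count are placeholders):

import pandas as pd
from factor_analyzer import FactorAnalyzer  # assumes the factor_analyzer package is installed

data = pd.read_csv("items.csv")  # hypothetical file of item responses

for rot in ("varimax", "quartimax", "equamax"):
    fa = FactorAnalyzer(n_factors=3, rotation=rot, method="principal")
    fa.fit(data)
    print(f"\n{rot} loadings:")
    print(pd.DataFrame(fa.loadings_, index=data.columns).round(2))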
Thank you for your feedback. I checked the determinant to make sure there is no severe multicollinearity. Then I checked the reliability of the items (Cronbach's alpha), and it is quite high. Moreover, I looked at the corrected item-total correlations. It turned out that two items correlate quite weakly (less than 0.2) with the scale score of the remaining items, so I excluded them and ran the reliability analysis again; Cronbach's alpha improved. But I still have a few cross-loadings in the factor analysis that bother me, and as suggested I should check other orthogonal rotations before eliminating the problematic items.
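For what it's worth, both checks described here (Cronbach's alpha and the corrected item-total correlations) take only a few lines of Python; the data file is hypothetical and the 0.2 cutoff simply mirrors the description above:

import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of the remaining items."""
    return pd.Series({c: items[c].corr(items.drop(columns=c).sum(axis=1)) for c in items})

# items = pd.read_csv("brand_items.csv")            # hypothetical item data
# weak = corrected_item_total(items) < 0.2          # candidates for exclusion
# print(cronbach_alpha(items), weak[weak].index.tolist())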
I think that eliminating cross-loadings will not necessarily make your factors orthogonal. I mean, if two constructs are correlated, they may remain correlated even after the problematic items are removed. Do all your factors relate to a single underlying construct? In that case, I would try a Schmid-Leiman transformation and check the loadings of both the general and the specific factors. Additionally, you may want to check confidence intervals for your factor loadings.
Yes, you are right: all the factors relate to the same construct (brand image). What do you mean by "general" and "specific" factors? I have never used the Schmid-Leiman transformation. What are the decision rules? Or can you suggest any material for a quick review?
Can the Schmid-Leiman transformation be used when my results come from a varimax rotation? I guess it needs the pattern matrix for the analysis, or am I wrong?
Davit, I'm attaching Wolff and Preising's paper for a quick and readable introduction to the S-L transformation. As for the actual computation, I don't know what software you're using, but Wolff and Preising present syntax for both SPSS and SAS. You can also do it by hand (I have an Excel file for this, but I don't have access to it now), but I'd suggest you use the free software FACTOR (http://psico.fcep.urv.es/utilitats/factor/). I'm also attaching Baglin's (2014) didactic tutorial about this program.
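If you ever need to do the computation outside SPSS/SAS/FACTOR, the transformation itself boils down to two matrix operations. A minimal Python sketch for the case of a single higher-order factor (the input loadings below are invented):

import numpy as np

def schmid_leiman(first_order, second_order):
    """Schmid-Leiman transformation with one higher-order (general) factor.
    first_order : items x group-factors oblique pattern loadings.
    second_order: loadings of the group factors on the general factor.
    Returns item loadings on the general factor and on the residualized
    (orthogonal) specific factors."""
    L1 = np.asarray(first_order, dtype=float)
    L2 = np.asarray(second_order, dtype=float).ravel()
    general = L1 @ L2                          # item loadings on g
    specific = L1 * np.sqrt(1.0 - L2 ** 2)     # each column rescaled by its residual
    return general, specific

# Made-up example: 4 items, 2 group factors loading .75 and .65 on g.
L1 = np.array([[0.7, 0.1], [0.6, 0.2], [0.1, 0.8], [0.2, 0.7]])
g, spec = schmid_leiman(L1, [0.75, 0.65])
print(np.round(g, 2), np.round(spec, 2), sep="\n")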
From a quick look through the first paper, the Schmid-Leiman technique is used to transform an oblique factor solution containing a hierarchy of higher-order factors into an orthogonal solution. I used varimax (orthogonal) rotation in a principal component analysis. What would you suggest? And what do you think about the heterotrait-monotrait ratio of correlations?
Have you tried oblique rotation (e.g., promax)? That may reveal the multicollinearity via the "Factor Correlation Matrix" (in SPSS output, the last table). I assume that you are analyzing health-related data, so I wonder why you used orthogonal rotation. In my experience, most factors/domains in the health sciences are better explained when they are allowed to correlate rather than being kept orthogonal (i.e., factor-factor r = 0). The extracted factors also generalize more easily to CFA when the rotation is oblique.
In addition, a very high Cronbach's alpha (> .9; ref: Streiner 2003, "Starting at the beginning: an introduction to coefficient alpha and internal consistency") is also indicative of redundant items within a factor, so you may need to look at the content of the items.
If I use oblique rotation, then I will have a problem in the linear regression. I need factors that are independent, with no multicollinearity issue, in order to be able to run linear regression. After I extract the factors, the goal is to regress them against brand likeness, measured on a 0 to 10 scale. Also, only with orthogonal rotation is it possible to get exact factor scores for the regression analysis. I have checked the correlation matrix and the determinant to make sure there is no excessive multicollinearity (no correlations > 0.9).
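The workflow you describe could look roughly like the sketch below (Python, assuming the factor_analyzer and statsmodels packages, with factor scores as predictors and likeness as the outcome; all file names, column names, and the factor count are placeholders):

import pandas as pd
import statsmodels.api as sm
from factor_analyzer import FactorAnalyzer

data = pd.read_csv("survey.csv")              # hypothetical survey data
items = data.filter(like="item")              # the brand-image items

fa = FactorAnalyzer(n_factors=4, rotation="varimax", method="principal")
fa.fit(items)
scores = pd.DataFrame(fa.transform(items),    # factor scores per respondent
                      columns=[f"factor{i + 1}" for i in range(4)])

# OLS of brand likeness (0-10) on the orthogonally rotated factor scores.
ols = sm.OLS(data["likeness"], sm.add_constant(scores)).fit()
print(ols.summary())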
I have checked both direct oblimin and promax rotations. In both scenarios, I do not get overly high correlations between factors. In any case, varimax also showed no multicollinearity issue.
I have one question. If I have a severe multicollinearity issue between my variables (determinant less than 0.00001), should I first get rid of the variables causing it and then use an oblique rotation such as promax?
Given your explanation, using orthogonal rotation is well justified. In that case, you may need to look at the correlation matrix again (I find it easier to work with the correlation matrix by pasting the SPSS output into MS Excel). I would manually delete items that have substantial correlations (e.g., > .3) with all or almost all other items and run the EFA again. That might solve the cross-loading problem.
If the determinant is less than 0.00001, you have to look for the variables causing too high multicollinearity and possibly get rid of some of them.
I made a mistake when looking at the determinant of the correlation matrix: it is actually 2.168E-9 = 0.000000002168 < 0.00001, so I definitely have a severe multicollinearity issue. First, I looked for items with correlations above 0.8 and eliminated them; the determinant still did not exceed the threshold. Then I omitted items with correlations above 0.7, and now my determinant is 0.00002095 > 0.00001. Of the 24 initial items I retained only 17, and now I can run the EFA. What do you think about this? Any comments or suggestions?
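Your procedure (compute the determinant of the item correlation matrix, then drop items from highly correlated pairs until it rises above 0.00001) can also be scripted, which makes it easier to document which items were removed at each cutoff. A rough Python sketch with hypothetical data and the same 0.8/0.7 cutoffs:

import numpy as np
import pandas as pd

def prune_by_correlation(items: pd.DataFrame, cutoff: float) -> pd.DataFrame:
    """Drop one item from every pair whose absolute correlation exceeds cutoff."""
    corr = items.corr().abs()
    drop = set()
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a not in drop and b not in drop and corr.loc[a, b] > cutoff:
                drop.add(b)                       # keep the first item of the pair
    return items.drop(columns=sorted(drop))

# items = pd.read_csv("brand_items.csv")           # hypothetical 24-item data set
# print(np.linalg.det(items.corr()))                # determinant of R
# for cutoff in (0.8, 0.7):                         # mirrors the steps described above
#     items = prune_by_correlation(items, cutoff)
#     print(cutoff, items.shape[1], np.linalg.det(items.corr()))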
However, I would be very cautious about it, since the literature suggests that multicollinearity (VIF) values between 5 and 10 are already considered high. In factor analysis, it is important to avoid severe multicollinearity so that items can be assigned to factors; otherwise the analysis will suffer from many cross-loadings and you will end up with correlated factors.
It seems to be the case that your factors are correlated, and they will remain correlated no matter what you do. If somehow you manage to make them orthogonal, they may not be measuring the same construct anymore. My suggestion of an S-L transformation was so you could check whether the items are more influenced by the general or by the specific factors. You might be able to extract those items that are clearly influenced only by their specific factors and not so much by the general one. Even then, however, you may not achieve orthogonality, or, if you do, you will possibly be measuring only a specific aspect of the original construct. (For example, if you have items measuring anxiety and depression and you submit them to an S-L transformation, you may be left with items related only to physiological hyperarousal in the anxiety-specific factor.)
As Wan has already suggested, consider using SEM for creating a model that includes both the correlation between your factors and any reasonable cross-loadings that you have.
The problem here is that you can have VIF values under 3.3 (no multicollinearity), HTMT values under 0.90 (discriminant validity supported, i.e., distinct constructs in your model), and the Fornell-Larcker criterion satisfied (again supporting discriminant validity). All of these indicate you can proceed with your model, and yet the cross-loadings criterion may still not be met.
For this reason, some researchers tell you not to care about cross-loadings and only explore VIF and HTMT values.
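For completeness, the VIF check is straightforward to reproduce; a sketch assuming statsmodels and a DataFrame of construct/factor scores (the file name is a placeholder), to be compared against the 3.3 (or 5-10) rules of thumb mentioned above:

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

scores = pd.read_csv("construct_scores.csv")   # hypothetical construct scores
X = sm.add_constant(scores)                    # add the intercept column
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=scores.columns, name="VIF",
)
print(vif)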
Aurelius arlitha Chandra: Check whether a cross-loading issue exists for that variable. If so, try removing the variable, guided by "Cronbach's Alpha if Item Deleted". Or check the communalities: if less than 0.3, remove the item.
Can anyone provide a reference of the idea that when an item loads on more than a single factor (cross-loading), such an item should be discarded if the difference in loadings is less than .2? I've read it on many statistics fora but would like to have a proper reference. Thank you.