I am interested in seeing opinions about the use of single items in practice and the results you got. Do you use single-item categorical variables in SEM?
Single indicator latent variables can be specified by fixing the (continuous) observed indicator's factor loading to 1 and fixing its error variance to a value 'a' based on the indicator's variance and its (assumed) reliability: a = Var(x)*(1 - rho), where Var(x) is the variance and rho is the reliability of the observed indicator (Brown, 2006, p. 139). If you assume your indicator has perfect reliability, a = 1. In Mplus syntax the specification is: f by x@1; x@a;
This specification will lead to a just-identified model (or a locally just-identified part of a model). So, as long as your indicator has variance, there will be no empirical problem. A challenge is that you need knowledge, or at least an educated guess, about the indicator's reliability. Hayduk & Littvay (2012, DOI 10.1186/1471-2288-12-159) wrote an interesting paper promoting the use of single indicator LVs.
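As a rough illustration of the arithmetic, here is a minimal Python sketch that computes the value to fix the error variance at and prints the matching Mplus line. The scores and the reliability of .80 are made-up numbers, and the helper names fixed_error_variance and mplus_lines are hypothetical, not part of any package.

```python
import statistics


def fixed_error_variance(values, reliability):
    """Return a = Var(x) * (1 - rho) for a single continuous indicator."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    return variance * (1.0 - reliability)


def mplus_lines(factor, indicator, error_variance):
    """Format the 'f by x@1; x@a;' specification with a numeric value for a."""
    return f"{factor} by {indicator}@1; {indicator}@{error_variance:.4f};"


# Hypothetical item scores and an assumed (not estimated) reliability of .80
scores = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
a = fixed_error_variance(scores, reliability=0.80)
print(mplus_lines("f", "x", a))
```

Because the reliability is an assumption, it may be worth rerunning the model with a few plausible values of rho to see how much the structural estimates move.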
Hi Maria, if you use a single-item measure you have to "fix" the "loading" and the "error" term (for example to 1.0 and 0.0, the case of a "perfectly" reliable item). Without these constraints, SEM software cannot estimate these parameters because this part of the model is not identified. KG, Christian
There is a recent thread on "Latents with one indicator" on SEMNET that you might be interested in. See: https://listserv.ua.edu/cgi-bin/wa?A1=ind1404&L=semnet#38
Hi Maria, it depends on your dataset, but ideally two or more indicators are needed to identify the measurement or CFA model. In my personal experience, one indicator has not worked. Sometimes two aren't enough either. You may get away with two variables provided there is enough information in the input matrix to identify the "just-identified model."
Thank you all, your responses are great. Johannes, I think you meant a = 0 if perfect reliability. The most common situation I have seen is fixing the loading to 1 and the error term to 0. I believe this works especially for demographic variables, but usually not for behavioral ones. What do you think about categorical single items?
Do you mean that only one categorical variable will be used in the SEM? Or do you have multiple single categorical variables that measure the same construct? Or are you planning to use a single categorical variable within the path model of the SEM?
You can use ethnicity as a non-modifiable variable in your SEM or pathway analysis, where you may potentially adjust your model for it and look at mean difference between ethnic groups.
If you plan on using occupation in SEM then you have two options:
1) use occupation (being a socioeconomic status indicator) with other socioeconomic variables such as education and income to form a socioeconomic latent variable and use it in SEM.
2) You can have it as an independent variable in your path model, depending on your hypothesis. Just remember to have a causal argument to back up your theory and consider the following:
- In what order are your variables related?
- Does X (occupation) cause Y (satisfaction)? If yes, then you can say that Y (satisfaction) does not come before X (occupation). This does not necessarily mean that X will result in Y, and there may be other variables before X that influence both X and Y. That's why you have your variables in a particular order in your full SEM.
There is no right or wrong answer here, it all depends on your study and hypothesis. You can just as easily have a SEM or path analysis looking at socioeconomic variables only. Hope this helps.
You are right: a = 0 with perfect indicator reliability, because a is the error variance. Thanks for pointing out my mistake.
I would say that assuming zero error is unreasonable for many variables, may they be demographic or behavioral. Nevertheless I would tend to treat occupation as a manifest variable, because it can be directly observed. (You may use occupation as an indicator for a latent construct such as socio-economic status, of course, as Mash Hamid points out.)
The approach explained above does not work with categorical variables (see here: http://www.statmodel.com/discussion/messages/11/1302.html?1266225270) because error variances are not model parameters in latent variable models with categorical indicators (see e.g. Brown, 2006).
Dear all, thanks for the interesting thread. I just wanted to know: can we use two single-item measures in SEM? Would it be possible for you to provide a reference that says either 'yes' or 'no' to this question?
You mean two or more single indicator latent variables in a model? Yes, this is possible. There is no technical restriction. The article by Hayduk & Littvay (2012, DOI 10.1186/1471-2288-12-159) provides a general discussion.
Hi Johannes, thanks for your reply. Yes, I meant two or more single indicator latent variables. For example, I have pay satisfaction, which was measured with just one item, and another variable which was also measured with just one item. Should I include both as latent or observed? If I add them as latent, then I have to set the error variance to 0 as per the discussion above in this thread. Is that correct?
Hi Amina, modeling a single indicator as a manifest or as a latent variable will lead to identical model fit, regression coefficients etc. if you fix the residual variance at 0. Both specifications (probably unrealistically) assume perfect reliability of the indicator. Specifying a single indicator latent variable, however, has the advantage that you can fix the error variance to a nonzero value and, thus, account for its unreliability. Keep also in mind that specification reflects your theoretical assumptions about the variables. IMHO, pay satisfaction is not directly observable. Therefore it may be better specified as latent.
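To make the difference concrete, here is a small Python sketch for the simplest case: one outcome y regressed on one single indicator latent variable behind x. That part of the model is just-identified, so the slope can be written directly from the sample moments as Cov(x, y) / (Var(x) - a). The simulated data, the true slope of 0.6, and the function names are assumptions made only for this illustration.

```python
import random
import statistics


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def slope_on_single_indicator(xs, ys, reliability):
    """Slope of y on the factor behind x (loading 1, error variance Var(x)*(1 - rho))."""
    var_x = statistics.variance(xs)
    error_variance = var_x * (1.0 - reliability)
    factor_variance = var_x - error_variance  # equals Var(x) * rho
    return covariance(xs, ys) / factor_variance


# Simulated example: true scores with variance 1, measurement error with
# variance 0.25, so the reliability of x is 1 / 1.25 = 0.8 by construction.
random.seed(1)
true_scores = [random.gauss(0, 1) for _ in range(5000)]
x = [t + random.gauss(0, 0.5) for t in true_scores]
y = [0.6 * t + random.gauss(0, 1) for t in true_scores]

manifest = slope_on_single_indicator(x, y, reliability=1.0)  # error variance 0
latent = slope_on_single_indicator(x, y, reliability=0.8)    # error variance fixed > 0
print(f"error variance 0: {manifest:.3f}   reliability .8: {latent:.3f}")
```

With the error variance at 0 the result is the ordinary regression slope, which is why the manifest and latent specifications agree in that case. With a nonzero fixed value the slope is divided by the assumed reliability, so the estimate is only as trustworthy as that assumption.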
Hi Johannes, I just tried running my model in AMOS with a single indicator variable. The model runs only if I use '0' as the error variance. If I use 'a' or any other letter, the model is unidentified and the notes say that I need to add one more constraint. Could you please guide me on how to deal with this, or explain why it is happening? Thanks in advance.
Hi Amina, you need to fix the error variance to a specific numeric value (e.g., .1). This value (which I called "a") is calculated as the product of the indicator's variance and its assumed unreliability (i.e., 1 - reliability).
Please excuse my "dummy" question... Can the variance and rho that you use to calculate the value you should fix the error variance to simply be found by a descriptive analysis and a cronbach alpha calculation in SPSS? It seems so simple that it would be so...
I'd say that if this is all the information you have, you might do that. It would be better, though, to have additional information from prior research or from an independent sample. Also keep in mind that Cronbach's alpha rests on some assumptions, and its value may be biased if they are violated. If you work in a CFA context anyway, you might prefer composite reliability over alpha because it has less restrictive assumptions. See e.g. Raykov & Marcoulides' (2011) Psychometrics textbook on these issues.
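For what it is worth, the alpha-based calculation can be sketched in a few lines of Python. Note that alpha needs at least two items, so this only applies when the single indicator is a sum score (parcel) of several items, not a truly single item. The data below are invented, and the sketch assumes the parcel is the plain sum of its items; for a mean score the variance would have to be computed on the mean instead.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of columns, one list of scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(row) for row in zip(*items)]  # sum score per respondent
    item_var_sum = sum(statistics.variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var_sum / statistics.variance(totals))


# Hypothetical data: three items answered by five respondents
items = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 2.0, 3.0, 5.0, 5.0],
    [1.0, 3.0, 3.0, 4.0, 4.0],
]
alpha = cronbach_alpha(items)
parcel = [sum(row) for row in zip(*items)]
a = statistics.variance(parcel) * (1.0 - alpha)  # value to fix the error variance at
print(f"alpha = {alpha:.3f}, fixed error variance = {a:.3f}")
```

The same caveats apply as above: alpha from the analysis sample is a stand-in for the reliability, and an estimate from prior research or an independent sample would be preferable where available.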
Thank you for your reply! I am interested in using this approach in a structural equation model, where I want to test regression paths between latent variables. Because the model is quite complex, I was intending to use this approach to create parcels that would be the sole indicators of each latent variable. I have checked my data and the variances are quite large, making the error variances also quite large (i.e., up to 6...). So, my model is not converging... Do you have any suggestions on how best to address this?
Can't say anything about the large variances, but it might be helpful to start with testing the measurement model of each latent variable separately. That way you will see the relations of the items and their respective variables and whether something is wrong on this level.
I have a similar problem with SEM using Lisrel. All my variables are ordinal: the indicators of the independent latent variable (intention) and the observed dependent variable (behaviour) are all ordinal. If I try to define an observed variable as dependent on the latent variable, Lisrel treats it as another indicator of that latent variable. To work around this, I have tried using a single indicator latent variable: I create a latent variable with behaviour as its single indicator, and this latent variable then becomes the dependent variable, while the latent variable intention remains the independent variable. The model runs, but after reading the discussion above I have doubts about its reliability. So I have two questions:
1) In Lisrel, how can I treat an observed variable (behaviour) as a dependent variable of a latent variable without it being mistaken for another indicator of the independent latent variable intention?
2) If this is not possible, can I use a single indicator latent variable to define my dependent variable?
A single-item variable is represented as a rectangle, whether it is exogenous or endogenous. Sometimes we deal with categorical variables as moderators. In that case you would be testing model invariance (multigroup analysis), etc.
Hello, in my research model I have a single-item latent variable. Can anyone suggest how to handle a single-item construct (as the dependent variable) in SEM (in AMOS)?
I am involved in a transcultural study and currently I have a lot of doubts related to my CFA model in Mplus. I'm running a model using a single item to describe one dimension. However, I think that I should adjust the item in my syntax using the formula noted by Johannes Bauer.
(i.e.,
F1 by mars14@1; mars14@0;
F2 by Mars15-mars18;
F3 by F2 F1;).
I appreciate your help or some additional article that I can read.