Hi Sunny, the answer is 'it depends'. If your dichotomous or categorical variables are exogenous (independent variables), then the binary variables have a clear interpretation: the slope tells you the difference in means between the two levels of the variable. Similarly, categorical variables can be recoded into dummy variables, and each dummy is interpreted the same way, as the difference in means from the reference category. If you Google 'dummy variables regression' you'll get loads of helpful information and videos.
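To make the "slope equals the difference in means" point concrete, here is a minimal sketch in plain Python. The data and the names `group`, `score`, and `ols_slope_intercept` are made up for illustration and are not from any real study.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: a 0/1 dummy predictor and a continuous outcome.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [3.0, 4.0, 5.0, 4.0, 6.0, 7.0, 8.0, 7.0]

slope, intercept = ols_slope_intercept(group, score)

# Group means computed directly, for comparison with the regression output.
mean0 = mean([s for g, s in zip(group, score) if g == 0])
mean1 = mean([s for g, s in zip(group, score) if g == 1])

print("slope:", slope, "difference in means:", mean1 - mean0)
print("intercept:", intercept, "mean of reference group:", mean0)
```

With a 0/1 dummy, the intercept is the mean of the group coded 0 and the slope is the gap between the two group means, which is what the printed values let you check.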
If your dichotomous or categorical variables are endogenous (dependent variables), then things are a little different. You'll no longer be working with a linear regression model, because you'll want to find out the probability of people being in a particular category/level rather than the linear association between variables. With binary/categorical dependent variables you need to use binary logistic regression or multinomial regression (or some variation of these). This may be harder to do in an SEM model, as not all SEM programs will accommodate categorical DVs. I don't think AMOS or LISREL can handle these types of model, but Mplus will do this type of analysis very easily.
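As a rough illustration of what "modelling the probability of being in a category" means, below is a small sketch of binary logistic regression with one predictor, fitted by plain gradient ascent on the log-likelihood. The data, the learning rate, the iteration count, and the helper names (`sigmoid`, `fit_logistic`, `predict`) are all assumptions chosen for the example; in practice you would use your statistics package's own routine.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, lr=0.05, n_iter=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the mean log-likelihood with respect to b0 and b1.
        g0 = sum(yi - pi for yi, pi in zip(y, p)) / n
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Hypothetical data: a continuous predictor and a yes/no (1/0) outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)

def predict(xi):
    """Predicted probability of a 'yes' response at predictor value xi."""
    return sigmoid(b0 + b1 * xi)

print("intercept:", b0, "slope (log-odds per unit of x):", b1)
print("P(yes | x=1):", predict(1), "P(yes | x=8):", predict(8))
```

Unlike the linear case, the slope here is a change in log-odds, and the predicted values always stay between 0 and 1.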
Thanks, Mark! Very informative and helpful. The variables belong to my endogenous construct. It is not the construct itself that is dichotomous, but its item measurements: instead of a 7-point Likert-type scale, I have dichotomous yes/no responses for the measures (a reflectively measured endogenous construct). The DV itself is thus not categorical.
I cannot recommend best practice specifically for management scholars, but best practice with dichotomous indicators in SEM is to use a WLSMV estimator and treat the indicators as categorical. Alternatively, you can also consider Rasch modelling, although WLSMV with logit as link function and theta parameterization will yield essentially the same results.
Hi Georg, are there any helpful textbooks or articles that explain why WLSMV is better suited for dichotomous indicators? In the meantime, I am reading up on the Rasch Model :) Thanks!
The simple answer is that the assumptions of the ML estimators are usually violated by categorical/dichotomous variables, both in theory and in practice. If you want to read up, I would suggest Lei (2009) and Savalei and Rhemtulla (2013).
Lei, P. W. (2009). Evaluating estimation methods for ordinal data in structural equation modeling. Quality and Quantity, 43, 495-507.
Savalei, V., & Rhemtulla, M. (2013). The performance of robust test statistics with categorical data. British Journal of Mathematical and Statistical Psychology, 66, 201-223.
As Georg suggests, WLSMV is a good choice if you use Mplus.
Finney and DiStefano (2006) also provide a good overview of the topic.
Finney, S. J., & DiStefano, C. (2006). Non-normal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 269-314). Greenwich: Information Age Publishing.
But I just want to add something to Georg's answer (with focus on ordinal data):
Maybe the answer isn't that simple. In the 1990s and also in the first decade of the new millennium, asymptotically distribution-free estimation (ADF in AMOS, WLS in LISREL and AGLS in EQS) was recommended for non-normal (including ordinal) data, because it makes no assumptions about the distribution. But simulations have shown that ML outperforms ADF, especially in "small samples" (ADF needs at least N > 1000), even when the ML assumptions are violated and the ADF assumptions about the distributional form are not. And there are also recent examples where, under some conditions, ML is recommended for ordinal data (see Rhemtulla et al., 2012).
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17(3), 354.
Hi Georg, yes, robust ML performs better regarding SE bias and fit bias, but the parameter estimates for robust ML are identical to those of standard ML (only the SEs and Chi² are corrected). And there are actually some situations where standard ML also performs better. E.g., Sass et al. (2014) show that for invariance testing based on deltaCFI/TLI... ML performs better than WLSMV.
intranational greetings :-)
Sass, D. A., Schmitt, T. A., & Marsh, H. W. (2014). Evaluating model fit with ordered categorical data within a measurement invariance framework: A comparison of estimators. Structural Equation Modeling: A Multidisciplinary Journal, 21(2), 167-180.