Best statistical analysis for an experimental vignette study?

Claudia Negri-Ribalta @Claudia-Negri-Ribalta

30 July 2021

Hello everyone

I have read extensively on this platform about the use of different models to analyze Likert-scale and ordered dependent variables. I would like to share my plan and hear your opinions on whether it is the best model.

My context is the following: we asked respondents how comfortable they would feel downloading an app with different characteristics (factors) on a scale from 1 to 11, 1 being "under no circumstances would I download such an app" and 11 being "I would download and use that app every day". There are three factors with 2, 3 and 7 levels respectively (open source, security and app provider). We split the decks by open source, as our previous research showed it wasn't significant, meaning that each respondent evaluated a set of vignettes that was either all open source or all non-open source. We used cluster sampling, and our sample (600 answers) is representative of our target population.

I have read in the sociological methodology literature that, because the Likert scale has 11 points (which brings a number of benefits) and the study is set up experimentally, you can use ANOVA, OLS and random intercepts models. However, I feel a bit uncomfortable using these, as some assumptions are violated. Thus, I decided to use an ordered logit regression (OLR), since for me the dependent variable (willingness to download) is ordered. The parallel lines assumption isn't violated and all variables are significant, which gives me confidence that I can use this model. However, I started wondering whether a multinomial logistic regression might be more appropriate.

I'm using R for the analysis, with the MASS package (specifically the polr function for the OLR), and brant and poTest for checking the parallel lines assumption. I have cross-checked the results in Stata and they match.

In the article I also plan to include the ANOVA, OLS and random intercepts model to add robustness to the analysis. What's interesting is that, although some specific coefficients change from OLS to OLR, the conclusions are the same.

Thus: should I use the multinomial logistic regression or not? Any comments on what to report, or suggested improvements?

Edit PS: My ANOVA shows that the independent variables don't interact. Should I still include the interaction in the OLR? Currently the model is dep.var ~ x1 + x2. Would you suggest dep.var ~ x1 + x2 + x1:x2 as a better fit, even if the ANOVA with interaction says the interaction isn't significant? And if you think the OLR should include the interaction, do you know how to test whether it is significant?

David L Morgan

The number of response categories that an item has does not affect whether it is ordinal-level or interval-level. So, if you have only the one Likert-scored item as your dependent variable, then you should use non-parametric statistics.

Claudia Negri-Ribalta

David L Morgan I understand your point, but from what I have read about the methodology of experimental factorial surveys, particularly Auspurg and Hinz (2015), the authors consistently state that with this specific type of methodology non-parametric and parametric tests have yielded similar results. Indeed, Wallander's (2009) survey found that a considerable number of studies using this methodology used parametric tests.

I thought about using Fisher, but it doesn't give me enough conclusions. Mann-Whitney U was an option too, but then again it is only one-way... I would like to hear more of your opinion.

Thom Baguley

I have a pre-print on this topic that might be useful:

https://osf.io/preprints/socarxiv/6n3zt/

I'm away from the computer for a few weeks from tomorrow, so probably won't be able to comment further for a while.

I think it's irrelevant that Wallander found lots of people use parametric methods: if they are inappropriate for a particular study, they are inappropriate regardless of popularity.

Personally I would treat the Likert-style response as ordinal. This may not be so much of an issue if all you care about is detecting effects as your sample is large, but treating ordinal scales as continuous can distort results in some cases. I think a bigger issue is how you have sampled vignettes. In a true experimental vignette study you have a large fraction of the vignette universe and a nested design. In practice allocation is often not purely nested. This means that analyses that ignore the nested structure have inflated Type I error rates. Even if the design is nested there are sometimes vignette features not part of the vignette dimensions that are ignored in analysis. Not taking them into account also potentially inflates Type I error rates.

Claudia Negri-Ribalta

Thom Baguley Brilliant! Thanks for the answer. I'll read your pre-print and your answer carefully

Salvatore S. Mangiafico

You are right to use ordinal regression. In general, you don't want to use multinomial logistic regression because it doesn't take into account that e.g. 2 > 1 and 3 > 2. There are some cases when it makes sense to treat ordinal data as nominal, but I doubt you would want to use it in this case.
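To make the contrast concrete, here is a sketch of the two model forms in standard textbook notation (the symbols are illustrative and not taken from the thread). For a response with J ordered categories, the proportional odds model used by polr has one threshold per cut point and a single coefficient vector:

$$
\operatorname{logit} P(Y \le j \mid x) = \zeta_j - x^\top \beta, \qquad j = 1, \dots, J-1
$$

The multinomial logit model instead treats the categories as unordered and fits a separate intercept and coefficient vector for each non-reference category:

$$
\log \frac{P(Y = j \mid x)}{P(Y = 1 \mid x)} = \alpha_j + x^\top \beta_j, \qquad j = 2, \dots, J
$$

As an illustration, assuming J = 11 and 8 dummy-coded predictors, the first model estimates 10 thresholds plus 8 slopes (18 parameters), while the second estimates 10 × 9 = 90.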

I don’t think there’s any reason to also use OLS. I can see doing that analysis out of curiosity, but I doubt there’s any reason to present the results. Also note that, if you are interested in including random effects, the ordinal package in R allows you to formulate mixed effects models.
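A minimal sketch of such a mixed effects model, assuming the ordinal package is installed. The data are simulated, and the names id, x1, x2 and y, as well as the layout of 100 respondents with 6 vignettes each, are hypothetical and not the poster's actual design:

```r
library(ordinal)  # provides clmm(); install.packages("ordinal") if missing

set.seed(7)
# Hypothetical layout: 100 respondents, each rating 6 vignettes
n_id  <- 100
n_vig <- 6
d <- data.frame(
  id = factor(rep(seq_len(n_id), each = n_vig)),
  x1 = factor(sample(c("low", "mid", "high"), n_id * n_vig, replace = TRUE)),
  x2 = factor(sample(paste0("provider", 1:7), n_id * n_vig, replace = TRUE))
)

# Simulated latent comfort: respondent-level intercept plus an effect of x1
u      <- rnorm(n_id, sd = 1)
latent <- u[as.integer(d$id)] + 0.8 * (d$x1 == "high") + rlogis(nrow(d))

# Cut the latent score into 11 ordered categories of roughly equal size
cuts <- quantile(latent, probs = seq(0, 1, length.out = 12))
d$y  <- cut(latent, breaks = cuts, labels = 1:11,
            include.lowest = TRUE, ordered_result = TRUE)

# Cumulative link mixed model with a random intercept per respondent
fit_mixed <- clmm(y ~ x1 + x2 + (1 | id), data = d)
print(summary(fit_mixed))
```

The random intercept is one way to account for several ratings coming from the same respondent; whether it matches the real sampling and deck structure is something only the study design can settle.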

I don’t quite understand the question about interactions… Does polr not allow you to include interactions? Are you using something like car::Anova to get an anova-like table?
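For reference, one common way to test an interaction in an ordered logit model is a likelihood ratio test between the fits with and without it. The sketch below uses simulated data; the balanced 3 x 7 design, the variable names and the effect sizes are assumptions for illustration only:

```r
library(MASS)  # provides polr() and an anova() method for polr fits

set.seed(42)
# Hypothetical balanced design: 3 security levels x 7 providers, 30 ratings per cell
d <- expand.grid(x1  = c("low", "mid", "high"),
                 x2  = paste0("provider", 1:7),
                 rep = 1:30)

# Simulated latent comfort with main effects only (no true interaction)
latent <- 0.4 * (d$x1 == "mid") + 0.8 * (d$x1 == "high") + rlogis(nrow(d))

# Cut the latent score into 11 ordered categories of roughly equal size
cuts <- quantile(latent, probs = seq(0, 1, length.out = 12))
d$y  <- cut(latent, breaks = cuts, labels = 1:11,
            include.lowest = TRUE, ordered_result = TRUE)

fit_main <- polr(y ~ x1 + x2, data = d, Hess = TRUE)  # main effects only
fit_int  <- polr(y ~ x1 * x2, data = d, Hess = TRUE)  # adds the x1:x2 interaction

# Likelihood ratio test of the interaction, computed by hand
lr_stat <- deviance(fit_main) - deviance(fit_int)
lr_df   <- length(coef(fit_int)) - length(coef(fit_main))
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same comparison through the anova() method for polr fits
print(anova(fit_main, fit_int))
```

This tests all interaction terms jointly (12 degrees of freedom in this hypothetical design) instead of judging individual coefficients one by one.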

Claudia Negri-Ribalta

Salvatore S. Mangiafico Thanks for the answer. Yes, in fact I wanted to use OLS out of curiosity and to check whether it was OK. Indeed, I have used the ordinal package for the mixed effects.

polr does allow you to include interactions; I was asking about the model in general. However, I have decided to go with Thom Baguley's answer. Curiously, both OLR and ML have shown very similar results, which leaves me fairly confident that the models I'm testing are correct. Now I'm in the process of selecting the best ML.
