06 February 2021

I have a dataset from a questionnaire with a dependent variable called skill level. This is an ordinal variable coded with the values 1-5. I also have multiple (up to 10) independent variables, each coded as a binary value, 0 or 1. The independent variables are not ordinal: 0 does not mean "less" or "worse" than 1. The variables were simply statements in the questionnaire that the respondents marked as true or not true for them. It is also not inherently "better" for a respondent to mark all, or none, of the statements as true.

I would like to study which independent variables are the most important, that is, which ones affect the dependent variable the most. So I'd like to find a model or a set of analysis techniques that would identify associations between the IVs and skill level. If possible, I would also like to get a significance value for each independent variable. My dataset has around 350 data points, but they are not equally distributed across skill levels: there are fewer data points for the highest and lowest skill levels than for the middle ones. The data is also not normally distributed.
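To make the per-variable question concrete, here is a minimal sketch, using only the Python standard library, of a Spearman rank correlation between one binary statement and the ordinal skill level, with a permutation p-value that does not rely on normality. The data and the names `stmt` and `skill` are made up for illustration (they are not from my dataset), and the permutation test is just one possible way to get a significance value, not a recommendation over other methods.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |rho| at least as large as observed."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one binary statement and a 1-5 skill level.
stmt = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
skill = [1, 2, 4, 5, 2, 3, 5, 1, 4, 3, 3, 2]

rho = spearman(stmt, skill)
p_value = permutation_p_value(stmt, skill)
print(f"rho={rho:.3f}, permutation p={p_value:.4f}")
```

Running this once per independent variable gives one coefficient and one p-value per statement. With up to 10 variables, those p-values would need some multiple-comparison adjustment before being compared.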

I tried ordinal regression, but it did not perform well. I think it might be because the variance in the data values is large (they are only 0 or 1). In some cases, the model even showed p-values near 1 for variables whose Spearman correlation p-value was reported as 0.00, and the sign of the coefficient (+/-) was the opposite of the sign of the correlation.

Could you help me find a technique to study the data and find the most "important" variables in it? As for software, I'm most familiar with Python but I can get help with SPSS and R if there are packages or techniques that are limited to those. Thank you!

P.S. I was thinking of fitting a separate logistic regression for each independent variable against skill level, but I'm not sure if it's a good idea. I don't know if there is a statistically sound way to compare the results of the separate logistic regression models. Please tell me if you have any ideas on how that could be done and interpreted.

Asked by Timo Ijas