10 December 2020

I have a dataset with one nominal independent variable with 10 different levels and one dichotomous dependent variable.

What would be the appropriate statistical test to compare the different levels of the IV?

Example:

Independent variable: "Favourite colour". Non-ranked levels: yellow, green, orange, blue, red, purple, black, white, brown and pink.

Dependent variable (dichotomous): "Smoker", yes (1) vs. no (0).

I am not interested in choosing a reference level (in this example a specific colour) since there is no solid way to decide which of these colours should be the reference.

The only idea I can come up with for statistical testing is a chi-square test (or Fisher's exact test) comparing each level of the IV to the combination of all the other levels. In other words, creating dummy variables (but without a reference level), e.g. "yellow vs. not yellow", and then performing a chi-square test on each. Next "green vs. not green", and so on through all 10 levels.
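To make the procedure concrete, here is a minimal sketch of the "each level vs. all other levels" comparison described above. It only illustrates the mechanics and does not settle whether the approach is sound. The counts are invented for illustration, and the helper names (`chi_square_2x2`, `fisher_exact_2x2`, `one_vs_rest`) are hypothetical. The chi-square version is assumed to be Pearson's statistic without continuity correction, and the Fisher p-value is assumed to be two-sided.

```python
from math import comb, erfc, sqrt

# Hypothetical counts, invented for illustration only: colour -> (smokers, non-smokers)
counts = {
    "yellow": (30, 20), "green": (12, 38), "orange": (10, 30), "blue": (20, 60),
    "red": (15, 45), "purple": (8, 22), "black": (14, 36), "white": (9, 31),
    "brown": (6, 19), "pink": (2, 38),
}

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value: sum over all tables (with the same
    margins) that are no more probable than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def one_vs_rest(counts):
    """Test each level against the pooled remaining levels."""
    total_s = sum(s for s, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    results = {}
    for level, (s, n) in counts.items():
        rest_s, rest_n = total_s - s, total_n - n
        chi2, p_chi2 = chi_square_2x2(s, n, rest_s, rest_n)
        results[level] = {
            "rate": s / (s + n),                    # smoking rate in this level
            "rest_rate": rest_s / (rest_s + rest_n),  # rate in all other levels
            "chi2": chi2,
            "p_chi2": p_chi2,
            "p_fisher": fisher_exact_2x2(s, n, rest_s, rest_n),
        }
    return results

results = one_vs_rest(counts)
for level, r in results.items():
    print(f"{level:>7}: {r['rate']:.2f} vs {r['rest_rate']:.2f} in the rest, "
          f"chi2 p = {r['p_chi2']:.4f}, Fisher p = {r['p_fisher']:.4f}")
```

Note that this runs 10 separate tests on overlapping data, and the sketch reports the raw p-values without any adjustment for multiple comparisons.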

Is this an accepted way to compare the different levels of a nominal variable?

My results would then look something like this:

"People with favourite colour yellow smoke significantly more than others".

"People with favourite colours green, orange, blue, red, purple, black, white and brown do not smoke significantly more (or less) than others".

"People with favourite colour pink smoke significantly less than others".

This analysis is easy to perform, but is it statistically sound?

Are there any better alternatives?

Or should I simply stick to descriptives without any statistical comparison?

Thank you

Anders Dahl