Hi,

I have what I think is a pretty basic question, but I'm going round in circles with it.

I have a relatively large dataset (2500+ cases). I'm working with categorical predictors and want to carry out a number of logistic regressions. Some IVs have a lot of missing data. The data come from a secondary source, so for a variable such as 'vulnerable', 1 means that it was recorded as present and 0 as absent, while a missing value may mean either that it is absent or that we just don't know whether it's present.

I've looked into dealing with missing data in logistic regression, but what I'm reading doesn't seem to fit my purpose. I don't want to carry out listwise deletion of cases. I'm wondering whether there are any rules about when to exclude a variable from the analysis (e.g. above a certain % of missing data), and whether anyone could point me towards a good resource?

Thanks,

Freya
