I have data from a small clinical trial of a psychological treatment, with ~60 participants followed over three time-points: baseline, post-treatment, and long-term follow-up. The baseline characteristics include clinical parameters such as height and weight as well as demographic variables. The primary outcome is a continuous variable. The secondary outcomes are several large questionnaires ranging from 12 to 92 items, each with a total score and in some cases up to 8 sub-scores. The primary and secondary outcomes were collected at each of the three time-points.

At study completion there is sporadic missing data in the questionnaires for several participants, as well as ~15 dropouts who are missing all outcome data after the baseline time-point. I was considering multiple imputation in order to run a full intention-to-treat (ITT) analysis. My understanding, however, is that with many more variables than observations, multiple imputation would either fail or produce unreliable results. In wide format, counting every questionnaire item at all three time-points, there are easily several hundred variables for 60 observations. Even if the questionnaires were scored for complete cases and multiple imputation were run at the score level rather than the item level, the number of variables would likely still exceed the number of observations, given the sub-scores, baseline variables, primary outcome, and three time-points.
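To make the variables-versus-observations concern concrete, here is a rough column count for the wide-format layout. It is only a sketch: the number of questionnaires, their individual item and sub-score counts, and the number of baseline variables are hypothetical placeholders chosen to sit within the ranges described above (12 to 92 items, up to 8 sub-scores), not the actual study design.

```python
# Rough count of wide-format columns versus participants.
# All questionnaire sizes and the baseline count are HYPOTHETICAL placeholders.

N_PARTICIPANTS = 60
N_TIMEPOINTS = 3            # baseline, post-treatment, follow-up
N_BASELINE_VARS = 10        # hypothetical: height, weight, demographics, ...
N_PRIMARY = 1               # one continuous primary outcome per time-point

# Hypothetical questionnaires: (number of items, number of sub-scores)
questionnaires = [(12, 0), (30, 3), (45, 5), (60, 6), (92, 8)]

items_per_timepoint = sum(n_items for n_items, _ in questionnaires)
# One total score per questionnaire plus its sub-scores
scores_per_timepoint = sum(1 + n_sub for _, n_sub in questionnaires)

# Item-level imputation: every item at every time-point is its own column
item_level_columns = (
    (items_per_timepoint + N_PRIMARY) * N_TIMEPOINTS + N_BASELINE_VARS
)

# Score-level imputation: only totals and sub-scores at every time-point
score_level_columns = (
    (scores_per_timepoint + N_PRIMARY) * N_TIMEPOINTS + N_BASELINE_VARS
)

print(f"item-level columns:  {item_level_columns} vs {N_PARTICIPANTS} rows")
print(f"score-level columns: {score_level_columns} vs {N_PARTICIPANTS} rows")
```

With these placeholder sizes, both layouts end up with more columns than participants, which is the situation I am worried about.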

My questions at this point are:

1. Am I correct that there is no reasonable multiple imputation method for this data set?

2. If I need to do an ITT analysis, which single imputation method (e.g. last observation carried forward, mean imputation) would be the least problematic?
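For clarity, this is what I mean by the two single-imputation options in question 2. It is a minimal sketch on made-up numbers for three hypothetical participants. The IDs, the values, and the helper names `locf` and `mean_impute` are all illustrative assumptions, not part of the trial data or of any library.

```python
from statistics import mean

# Made-up outcome values at (baseline, post-treatment, follow-up).
# None marks a missing measurement.
scores = {
    "p01": [20.0, 14.0, 12.0],
    "p02": [22.0, None, None],   # dropout after baseline
    "p03": [18.0, 15.0, None],   # missing follow-up only
}

def locf(values):
    """Carry the last observed value forward into later missing time-points."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def mean_impute(data):
    """Replace each missing value with the observed mean at that time-point."""
    n_times = len(next(iter(data.values())))
    col_means = [
        mean(v[t] for v in data.values() if v[t] is not None)
        for t in range(n_times)
    ]
    return {
        pid: [col_means[t] if v[t] is None else v[t] for t in range(n_times)]
        for pid, v in data.items()
    }

locf_scores = {pid: locf(v) for pid, v in scores.items()}
mean_scores = mean_impute(scores)

print(locf_scores["p02"])   # baseline value repeated at later time-points
print(mean_scores["p02"])   # later time-points filled with observed means
```

In this toy case, last observation carried forward treats the dropout as unchanged from baseline, while mean imputation assigns the dropout the average of whoever stayed in the study.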

Thank you!
