I have data collected from a pre/post intervention study with multiple dependent variables. The dependent variables are measured on an interval scale. The data was collected 'in the wild', and so there are some issues that are hampering the statistical testing I would like to conduct. The issues are:

  • The number of samples in the 'pre' phase of the study is smaller than in the 'post' phase. In the pre phase, subjects were measured daily for 14 days, whilst in the post phase they were measured for 21 days;
  • Some of the daily measurements were missed or overlooked. As such, the sizes of the pre and post data sets are not the same across all subjects.

Had I not encountered these issues, I would have planned to conduct a One-Way MANOVA with repeated measures to perform significance and interaction testing on the dependent variables. However, the issues outlined above make this difficult.

I have two possible solutions that I can think of:

  • I could find the subject's pre or post data set with the smallest number of samples. I could then randomly draw this number of samples from each subject's pre and post data sets to form a balanced, but smaller, set of data, and perform the One-Way MANOVA with repeated measures on it. However, the limitation here is that I don't make use of all of the samples collected (those not drawn at random are discarded).
  • Alternatively, I could calculate a mean value for each subject's dependent variable in both the pre and post phase of the study using all of the samples. This would provide me with a pair of values (pre/post) per subject for each dependent variable. I could then perform the One-Way MANOVA with repeated measures. However, the limitation now is that I have reduced my samples down to a mean value, and the means are based on differing numbers of samples, so some are less reliable than others.

I would love to hear your views as to whether solution 1 or 2 is better, or if there is a more effective way to deal with this situation.
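
To make the two options concrete, here is a minimal sketch of what I mean, for a single dependent variable. The data are randomly generated placeholders rather than my real measurements, and the function names (`make_example`, `balanced_subsample`, `phase_means`) are hypothetical helpers I made up for illustration. It only shows the data preparation step, not the MANOVA itself.

```python
import random
from statistics import mean

def make_example(seed=0):
    """Build placeholder long-format records: (subject, phase, value).

    Cell sizes differ per subject and phase to mimic missed measurements.
    """
    rng = random.Random(seed)
    records = []
    for subject in ("s1", "s2", "s3"):
        n_pre = rng.randint(10, 14)   # at most 14 daily measurements pre
        n_post = rng.randint(15, 21)  # at most 21 daily measurements post
        records += [(subject, "pre", rng.gauss(5.0, 1.0)) for _ in range(n_pre)]
        records += [(subject, "post", rng.gauss(6.0, 1.0)) for _ in range(n_post)]
    return records

def group(records):
    """Collect values into {(subject, phase): [values]}."""
    cells = {}
    for subject, phase, value in records:
        cells.setdefault((subject, phase), []).append(value)
    return cells

def balanced_subsample(records, seed=0):
    """Option 1: randomly draw the smallest cell size from every cell."""
    rng = random.Random(seed)
    cells = group(records)
    n_min = min(len(values) for values in cells.values())
    return {key: rng.sample(values, n_min) for key, values in cells.items()}

def phase_means(records):
    """Option 2: one mean per subject and phase, using all samples."""
    return {key: mean(values) for key, values in group(records).items()}

records = make_example()
balanced = balanced_subsample(records)  # equal-sized cells, some data dropped
means = phase_means(records)            # one pre/post pair per subject
```

With option 1 every subject/phase cell ends up the same size but some measurements are thrown away (and, in this sketch, the day ordering is not preserved); with option 2 every measurement contributes, but each cell collapses to a single number.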
