I need to compare the mean of a continuous variable y (range 1-20) between two groups defined by a dichotomous variable (i.e., gender). The sample sizes of the two groups are unequal (m=20, f=80).

It turns out that in the male (smaller) group all subjects have y=1, while in the female group the scores span the whole possible range, although the distribution is non-normal and highly skewed, with most values concentrated at the lower scores. As a result, the male subgroup has zero variance in y, while the female subgroup does not.

Given these issues, I decided to bootstrap the t-test to get a more reliable comparison of the means of the two groups:

I first applied a bootstrap stratified by gender, with a Welch t-test.
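For concreteness, here is a minimal sketch of such a stratified scheme in Python (NumPy). It runs on simulated data that only mimic the description above, not on the real sample. The percentile interval on the mean difference is one possible variant, assumed here for illustration; the helper names `welch_t` and `stratified_bootstrap` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data mimicking the description (NOT the real sample):
# 20 males all scoring 1, 80 females with skewed scores in 1-20.
males = np.ones(20)
females = np.clip(rng.geometric(0.3, size=80), 1, 20).astype(float)

def welch_t(a, b):
    """Welch t statistic for mean(a) - mean(b) (unequal variances)."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

def stratified_bootstrap(a, b, n_boot=5000, rng=rng):
    """Resample each group separately, keeping both group sizes fixed."""
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        a_star = rng.choice(a, size=len(a), replace=True)
        b_star = rng.choice(b, size=len(b), replace=True)
        diffs[i] = a_star.mean() - b_star.mean()
    return diffs

diffs = stratified_bootstrap(females, males)
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])
t_obs = welch_t(females, males)

print(f"observed difference = {females.mean() - males.mean():.3f}")
print(f"Welch t = {t_obs:.3f}")
print(f"95% percentile CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Note that every resample of the male group again contains only 1s, so in this scheme all the bootstrap variability comes from the female group.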

I then tried a wild bootstrap approach with the same t-test. After all, this comparison can also be seen as a special case of regression with a single dichotomous predictor, with extreme heteroscedasticity and non-normal residuals, and the wild bootstrap is usually recommended for regression under heteroscedasticity.
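As an illustration of the regression view, the following is a minimal sketch of a wild bootstrap with Rademacher (random sign) weights, again in Python (NumPy) on simulated data rather than the real sample. It assumes the unrestricted form, where residuals come from the fitted model and not from a null-imposed one; `ols` and `wild_bootstrap_slopes` are hypothetical helper names. In this textbook form each residual stays attached to its own observation and only its sign is flipped; other implementations may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data mimicking the description (NOT the real sample).
y = np.concatenate([
    np.ones(20),                                   # males: all y = 1
    np.clip(rng.geometric(0.3, size=80), 1, 20),   # females: skewed scores
]).astype(float)
g = np.concatenate([np.zeros(20), np.ones(80)])    # 0 = male, 1 = female

# Design matrix: intercept plus the group dummy.
X = np.column_stack([np.ones_like(g), g])

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

beta_hat = ols(X, y)        # beta_hat[1] is the female - male mean difference
fitted = X @ beta_hat
resid = y - fitted          # male residuals are (numerically) zero

def wild_bootstrap_slopes(X, fitted, resid, n_boot=5000, rng=rng):
    """Wild bootstrap: y* = fitted + resid * v, with v = +1 or -1 at random."""
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=len(resid))
        y_star = fitted + resid * signs
        slopes[i] = ols(X, y_star)[1]
    return slopes

slopes = wild_bootstrap_slopes(X, fitted, resid)
ci_low, ci_high = np.percentile(slopes, [2.5, 97.5])

print(f"estimated difference = {beta_hat[1]:.3f}")
print(f"95% percentile CI = ({ci_low:.3f}, {ci_high:.3f})")
```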

The relevant difference between the two bootstrap strategies is that with the wild approach I am bootstrapping residuals from one group onto subjects of the other group. The results are also very different: the wild bootstrap gives much smaller p-values and much narrower confidence intervals than the stratified bootstrap with the Welch correction.

My questions are:

1 Is a bootstrapped t-test valid in this situation? And if so,

2 Which of the two bootstrap approaches is more appropriate in this situation?

3 If I were to add covariates (i.e., an ANCOVA with one dichotomous predictor and two continuous predictors), which bootstrap strategy would then be the most appropriate?

Many thanks for your help!
