Let's consider the (standard) two-sample t-test for a difference in means. The t-value is the empirical difference in sample means divided by the SE of this difference (which in turn is derived from a pooled variance estimate). Therefore, by construction, the test is sensitive to differences in location, but not in dispersion. The p-value is calculated from a t-distribution (with n1+n2-2 d.f.).
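To make the construction concrete, here is a minimal sketch of the calculation described above. The function name `pooled_t` and the two small data sets are made up purely for illustration; only the Python standard library is used.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t, df, se) for the classical two-sample t-test with pooled variance."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    s2_pooled = ((n1 - 1) * statistics.variance(x)
                 + (n2 - 1) * statistics.variance(y)) / df
    # Standard error of the difference in sample means
    se = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, df, se

# Hypothetical illustrative data (not from any real study)
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df, se = pooled_t(a, b)
print(t, df, se)
```

The two-sided p-value would then be read from a t-distribution with `df` degrees of freedom, for instance with a statistics library of your choice.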

As I understand it, this t-distribution is derived under one single condition: the data from both groups are sampled from *one* population with a normal distribution. The normal distribution is often called an "assumption", and the fact that both samples are from the same population is called the "null hypothesis" (H0).

Is there any reasonable argument to assume that under H0(!) the two samples are coming from two different populations, with possibly different variances but the same means?

My question has two aspects:

I wonder if such a test is justified when we believe/know that the samples must have been taken from different populations (that might have the same mean or not, but surely have different variances). What is the rationale behind judging differences in mean values when I seem to be comparing apples and oranges anyway? (Just as a side note: often the difference in variances under H0 can be explained by inhomogeneities within one of the groups. This is, for instance, often observed in studies with diseased and control animals, where the diseased group suffers several side effects that increase the variability of the response, but not necessarily the mean. Wouldn't it be more appropriate, if possible, to adjust for these indirect effects instead of simply "assuming different variances"?) (And I know that, if we ignore all these logical considerations, there is the Welch correction.)

The second aspect is: since the p-value relates specifically to H0, only the conditions under H0 are relevant. Right? Again a typical example from biomedical research: the mean and variance of concentrations are usually correlated; the higher the observed mean, the higher the observed variability. I know that a log-normal model or a gamma GLM with log link is most appropriate for the analysis here, but I am asking about a simple t-test again (considering the violation of the normal-distribution assumption to be negligible!). The observed differences in variances are related to the different (sample) means. And under H0 (!) I think it is justified to assume equal variances (otherwise see the previous paragraph). Having said this, it follows that the t-test (without Welch adjustment) would be perfectly fine, although the sample data apparently have very different variances in the groups. The problem might be to get a good estimate of the SE. Using a pooled estimate might result in unnecessarily low power, but nothing could go so wrong as to accidentally inflate the type-I error rate. Right?
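One way to probe this question is a small simulation. The sketch below estimates the rejection rate of the pooled-variance t-test when both groups have the same mean, once for equal and once for unequal variances. Everything in it is an assumption chosen for illustration: the helper names `pooled_t` and `rejection_rate`, the group sizes (5 and 25), the standard deviations, and the hard-coded critical value for 28 d.f.

```python
import math
import random

# Two-sided 5% critical value of the t-distribution with 28 d.f.
# (only valid for the default group sizes n1 + n2 - 2 = 28 used below)
T_CRIT_DF28 = 2.048

def mean_var(v):
    """Sample mean and unbiased sample variance."""
    m = sum(v) / len(v)
    return m, sum((u - m) ** 2 for u in v) / (len(v) - 1)

def pooled_t(x, y):
    """Classical two-sample t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    m1, v1 = mean_var(x)
    m2, v2 = mean_var(y)
    s2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(s2 * (1 / n1 + 1 / n2))

def rejection_rate(sd1, sd2, n1=5, n2=25, reps=4000, seed=1):
    """Fraction of simulated data sets (equal means) rejected at the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, sd1) for _ in range(n1)]
        y = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(pooled_t(x, y)) > T_CRIT_DF28:
            hits += 1
    return hits / reps

# H0 read as "one population": equal means and equal variances
rate_one_population = rejection_rate(1.0, 1.0)
# H0 read as "equal means only": the smaller group has the larger variance
rate_unequal_variances = rejection_rate(3.0, 1.0)
print(rate_one_population, rate_unequal_variances)
```

Comparing the two printed rates shows how much the answer depends on which reading of H0 is adopted; with unequal group sizes, the second rate need not stay near 5%.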

The same questions apply to the "non-parametric" alternative, the Wilcoxon test. The p-value here is again derived under the assumption that both samples are taken from the *same* population (which doesn't need to have a normal distribution). It is often stated that this is a test of location shift (equality of the medians) if and only if all other moments of the distributions are the same. Again I wonder whether H0 does not automatically and necessarily imply that all moments must be identical, since there is only one distribution under H0.
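The "same population" assumption can be made explicit: if the group labels are exchangeable, every assignment of the ranks to the two groups is equally likely, and the null distribution of the rank sum follows by enumeration. The following is a minimal sketch of that idea, assuming no ties; the function name `rank_sum_null` and the group sizes 3 and 4 are illustrative choices.

```python
from itertools import combinations

def rank_sum_null(n1, n2):
    """Exact null distribution of the rank sum W of group 1.

    Assumes no ties and that every assignment of the ranks
    1..(n1 + n2) to the two groups is equally likely.
    """
    ranks = range(1, n1 + n2 + 1)
    counts = {}
    for combo in combinations(ranks, n1):
        w = sum(combo)
        counts[w] = counts.get(w, 0) + 1
    total = sum(counts.values())
    return {w: c / total for w, c in sorted(counts.items())}

null = rank_sum_null(3, 4)
# One-sided p-value for a hypothetical observed rank sum of 7 in group 1
p_lower = sum(p for w, p in null.items() if w <= 7)
print(p_lower)
```

Nothing in this enumeration refers to means, medians, or variances; it only uses the equal likelihood of all rank assignments, which is what sampling from one common population provides.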
