For the moment, I am using an unpaired t-test with Welch’s correction in the presence of unequal variance between groups. Of note, most of the variables have equal variance. Many thanks!
The "best" way to statistically analyze your data does NOT depend on the sample size but on two other things:
- your scientific question, and how you formulate this in a statistical model
- the kind or nature of your data, and how you represent it by a random variable (this includes the distributional assumptions that can reasonably be associated with the data)
Since you say you used a t-test, I assume that your problem is to test an expected difference between two random variables, because that is the only task this tool is designed for.
If you had different variances in the groups, the question arises: why? How can the expected difference be interpreted when it refers to different kinds of random variables? Often, heteroskedasticity implies that the distribution of the random variable is also not normal (which is an assumption made by the t-test) and that the entire statistical model is inappropriate (e.g. your data are counts and a Poisson model would be appropriate, or your data are proportions and a beta model would be appropriate, etc.).
Note also that, when you say that « most of the variables » have equal variance, it suggests you have a lot of variables available to compare your groups.
Depending on your exact question, this may call for either multiplicity corrections or multivariate techniques.
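To make the multiplicity point concrete, here is a minimal sketch of one common correction, the Holm step-down adjustment, written with the Python standard library only. The function name `holm_adjust` and the p-values are made up for illustration; they are not results from this thread, and Holm is only one of several possible corrections.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * pvalues[i])
        # Enforce monotonicity: an adjusted value never drops below an earlier one
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values, one per outcome variable
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = holm_adjust(raw)
print([round(p, 3) for p in adjusted])
```

Each adjusted value can then be compared with the usual threshold (e.g. 0.05) while keeping the family-wise error rate under control.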
As Jochen wrote, without knowing the question and the experimental protocol, there is no way to answer your question. There is no universal "best" method.
If your data are not normally distributed, an alternative nonparametric test is the Mann-Whitney U test. This also works with unequal sample sizes. Your sample sizes are rather low, by the way, but their appropriateness would depend upon the size of the effect(s) you are measuring. Large effects require smaller sample sizes to detect them.
It's not that simple. The U-test tests a different hypothesis than the t-test. The scientific question should determine what kind of hypothesis the researcher wants to test.
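A small sketch may help show why the hypotheses differ. The U statistic only counts, over all cross-group pairs, how often a value from one group exceeds a value from the other; it never uses the means. The function name `mann_whitney_u` and the numbers are made up for illustration.

```python
def mann_whitney_u(x, y):
    """U statistic for x: count of pairs with x_i > y_j, ties counted as 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


# Made-up values, not data from this thread
x = [1.0, 2.0, 3.0, 10.0]
y = [0.5, 2.0, 2.5]

u = mann_whitney_u(x, y)
# U / (n_x * n_y) estimates P(X > Y) + 0.5 * P(X = Y)
prob_superiority = u / (len(x) * len(y))
print(u, round(prob_superiority, 3))
```

The quantity being tested is therefore about the probability that a random observation from one group exceeds one from the other, which coincides with a difference in means only under additional assumptions. For a p-value, a library routine such as `scipy.stats.mannwhitneyu` can be used.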
Thank you for the follow-up, guys. The question is rather simple in this case. I want to examine the impact of cardiorespiratory fitness on a specific determinant of blood flow regulation. For that, I am comparing athletes (n=17) to controls (n=8). The distribution of the random variables is normal for all variables. The variance is equal for most of the variables...
In the case of your data (normal distribution and equal variance), the T-test for two independent samples will be suitable to test the null hypothesis of equality of the two means.
According to your question, you're interested in « a specific determinant of blood flow regulation », which suggests you have a single variable. However, you then say « for all variables » and « most of the variables », which suggests you have many. There is apparently a contradiction here.
In addition, you do not describe what effect of CR fitness you expect on this determinant, or what you're interested in. If any change is of interest, the Mann-Whitney test may be OK. If you are after a change in location, the question may or may not be solved by T-tests, and it may turn out to be a very difficult question, especially with small sample sizes, if you want to detect changes in location even when variances or distributions differ...
Please explain your experimental plan better if you want a clear answer on what you should do…
In the most common case of a single variable compared between two groups, assuming a Gaussian distribution and equal variances (so that the only difference between the two groups can be in the means), unequal sample sizes may not be a problem for the T-test in principle, but they will lower the power and make the test less robust to any departure from its assumptions.
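As a rough illustration of the power remark, under equal variances the standard error of the difference in means is sigma * sqrt(1/n1 + 1/n2), so for a fixed total of 25 subjects the split matters. The sketch below compares the 17/8 split from the question with a nearly balanced 13/12 split; the helper name `se_factor` is made up for illustration.

```python
from math import sqrt


def se_factor(n1, n2):
    """Multiplier of sigma in the standard error of a difference in means."""
    return sqrt(1.0 / n1 + 1.0 / n2)


unbalanced = se_factor(17, 8)   # design described in the question
balanced = se_factor(13, 12)    # same total, nearly equal groups

print(f"17 vs 8 : {unbalanced:.3f} x sigma")
print(f"13 vs 12: {balanced:.3f} x sigma")
print(f"ratio   : {unbalanced / balanced:.3f}")
```

A larger standard error for the same true difference means a smaller expected test statistic, hence lower power. This says nothing about robustness, which depends on how the assumptions are violated.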
The Welch–Satterthwaite approach is probably good enough to test for differences between means. It is distributed approximately as t with a degrees of freedom correction. Many computer programs implement some variation of this test for differences in means when the assumption of equal variances (observations drawn from the SAME population and the only effect is on the means) is not tenable. I would prefer to stay away from non-specific omnibus tests that don't involve the estimation and testing of parameter estimates (non-parametric) such as the Mann-Whitney U test when possible.
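For readers who want to see the degrees-of-freedom correction itself, here is a minimal sketch of the Welch statistic and the Welch–Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t` and the numbers are made up for illustration and are not data from this thread.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n)
    va = variance(a) / na
    vb = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch–Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up illustrative values
athletes = [5.1, 4.8, 5.6, 5.0, 5.3, 4.9, 5.4, 5.2]
controls = [4.2, 4.9, 3.8, 4.6, 4.4]

t, df = welch_t(athletes, controls)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The resulting df always lies between the smaller of (n1 - 1, n2 - 1) and n1 + n2 - 2. In practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` carries out the same test and also returns the p-value.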
If you have unstable variances, then I would worry about whether the very small sample sizes and possibly large variances yield enough power to detect true differences, and whether pooling is appropriate for effect size estimates.
When comparing groups, however, please don't stop with just comparing means (or some other estimate of location, such as the median, trimean, winsorized mean, etc.). What is the difference in spread (variance, range, midspread, etc.)? You might want to explore why the groups differ in spread. Next, look at shape. Are the distributions of the two samples skewed, normal, Cauchy or some other shape? What about peakedness? Is the kurtosis the same or not? Are there unusual observations such as outliers (points sampled from a distribution other than the one you think you've sampled from) or extreme but legitimate data points? These are the main ways in which two samples may reveal differences between the populations from which they are drawn.
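A quick way to go beyond location is to tabulate spread and shape for each group. The sketch below uses simple moment-based estimates of skewness and excess kurtosis (other estimators exist); the function name `shape_summary` and the numbers are made up for illustration. With samples this small, these estimates are very noisy and should be read as descriptive only.

```python
from statistics import mean


def shape_summary(sample):
    """Sample variance plus moment-based skewness and excess kurtosis."""
    n = len(sample)
    m = mean(sample)
    # Central moments (dividing by n)
    m2 = sum((v - m) ** 2 for v in sample) / n
    m3 = sum((v - m) ** 3 for v in sample) / n
    m4 = sum((v - m) ** 4 for v in sample) / n
    return {
        "variance": m2 * n / (n - 1),          # unbiased sample variance
        "skewness": m3 / m2 ** 1.5,            # 0 for a symmetric sample
        "excess_kurtosis": m4 / m2 ** 2 - 3.0, # 0 for a normal distribution
    }


# Made-up illustrative values
group_a = [5.1, 4.8, 5.6, 5.0, 5.3, 4.9, 5.4, 5.2]
group_b = [4.2, 4.9, 3.8, 4.6, 4.4]

for name, data in (("A", group_a), ("B", group_b)):
    summary = shape_summary(data)
    print(name, {k: round(v, 3) for k, v in summary.items()})
```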
If your data are cross-sectional and fulfil the assumption of normality, the independent two-sample T-test will be fine. However, if you have doubts about the normality assumption or the constant-variance assumption for the error term, the non-parametric analog of the T-test, known as the Mann-Whitney test, will be preferable. If your data are normal and repeatedly measured (longitudinal), you can use a linear mixed-effects model.
@ Yang Li: 1) with so few data, this seems difficult, and 2) to run a mixed-effects model, you need repeated measurements on the same patient/experimental unit, which was not implied by the design described in the question.
In addition to all the advice above, I would suggest drawing a normal probability plot (on normal probability paper) for each sample separately, and for the combined sample of all 17+8=25 observations. These plots may reveal departures from your assumptions or confirm them.
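Without probability paper, the same plot can be produced numerically: sort the sample and pair each value with the corresponding standard normal quantile. The sketch below uses the plotting positions (i - 0.5)/n, which is one common convention among several; the function names and the data are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def normal_plot_points(sample):
    """Pairs (theoretical normal quantile, ordered observation)."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n for i = 1..n
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def straightness(points):
    """Pearson correlation of the plotted points; near 1 means nearly linear."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative values
sample = [5.1, 4.8, 5.6, 5.0, 5.3, 4.9, 5.4, 5.2]
points = normal_plot_points(sample)
for q, v in points:
    print(f"{q:+.3f}  {v}")
print("correlation:", round(straightness(points), 3))
```

Points lying close to a straight line are consistent with normality; marked curvature or isolated points far from the line deserve a closer look. `scipy.stats.probplot` offers a similar computation if SciPy is available.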