Consider the following case:
There are k different groups, but only the mean difference between 2 of these groups is to be tested (t-test; assume the assumptions are met and that alpha and beta are given). The data for the other k-2 groups are collected for exploratory analysis, so they are available to give a better estimate of the within-group variance and hence a better estimate of the standard error of the mean difference of interest.
How can this extra information (i.e. the additional degrees of freedom) be taken into account when determining the sample size for the two groups to be tested?
To illustrate with an example:
If there are only 2 groups, the sample size can be calculated (in R) with
power.t.test(delta=1, sd=1, sig.level=0.05, power=0.8)
which gives n = 17 per group.
Now suppose there are additional data (from the other groups) from which the pooled variance could also be estimated. Given a total of m = 40 additional values in 4 additional groups, I would expect to need fewer than 17 values per group for the test if the pooled variance is estimated from all 2n+40 values in the 2+4 groups. But how can this sample size be calculated?
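To make the setup concrete, here is a minimal sketch of one possible calculation, not a confirmed solution. It assumes that all six groups share a common variance, that the test statistic uses the pooled SD from all groups (so the error degrees of freedom are 2n + 40 - 6), and that the power can be taken from the noncentral t distribution. The helper names `power_pooled` and `n_required`, and the arguments `m_extra` and `k_extra`, are hypothetical and not part of any existing R function.

```r
# Hypothetical helper (an assumption, not an existing R function):
# power of a two-sided t-test comparing two groups of size n each, where the
# pooled variance also uses m_extra values from k_extra additional groups.
power_pooled <- function(n, delta = 1, sd = 1, sig.level = 0.05,
                         m_extra = 0, k_extra = 0) {
  # Error degrees of freedom: total observations minus number of groups.
  df <- 2 * n + m_extra - (2 + k_extra)
  # Noncentrality parameter of the t statistic for the two tested groups.
  ncp <- delta / (sd * sqrt(2 / n))
  tcrit <- qt(1 - sig.level / 2, df)
  # Probability of rejecting in either tail under the alternative.
  pt(tcrit, df, ncp = ncp, lower.tail = FALSE) + pt(-tcrit, df, ncp = ncp)
}

# Hypothetical helper: smallest integer n per tested group reaching the
# target power, found by a simple upward search.
n_required <- function(power = 0.8, ...) {
  n <- 2
  while (power_pooled(n, ...) < power) n <- n + 1
  n
}

n_required(power = 0.8)                             # two groups only
n_required(power = 0.8, m_extra = 40, k_extra = 4)  # with 40 extra values in 4 groups
```

With `m_extra = 0` and `k_extra = 0` this reduces to the ordinary two-sample case, so the first call can be checked against `power.t.test` above. Whether the pooled degrees of freedom may be used this way in planning is exactly the point I am unsure about.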