I am proposing a cohort study to search for a predictive biomarker of treatment response. To calculate the sample size, I used the formula with a 95% CI and 0.05% error, taking p as the response rate from a previous study. Any suggestions?
It is difficult to answer your question directly because it is not clear what statistic you are using. If the outcome behind p is binary (response or no response), then a biomarker will be useful if it adds to what is already known to predict that outcome, such as demographics or pre-existing conditions. If, for example, a logistic regression model with age, sex, and pre-existing conditions has an AUC of 0.7, then a useful biomarker would be one that increases this AUC (or another measure of discrimination). What increase would be useful: to 0.75? To 0.8? Deciding this will help determine the sample size needed. Once you have a number, you will need to increase it to allow for "drop-outs" (e.g. people withdrawing from the study, or a failure to collect all relevant data). You can use the previous study to estimate what the drop-out rate will be.
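The drop-out adjustment mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `inflate_for_dropout` and the example numbers (200 complete cases, 20% drop-out) are assumptions made for the sketch, not values from the question or from the previous study.

```python
import math


def inflate_for_dropout(n_complete: int, dropout_rate: float) -> int:
    """Number of participants to enrol so that roughly n_complete remain.

    n_complete   -- sample size required with complete data (assumed known)
    dropout_rate -- expected proportion lost to withdrawal or missing data
    """
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    # Divide by the expected retention rate; the tiny tolerance guards
    # against floating-point noise before rounding up.
    return math.ceil(n_complete / (1.0 - dropout_rate) - 1e-9)


# Hypothetical inputs: 200 complete cases needed, 20% drop-out expected.
print(inflate_for_dropout(200, 0.20))  # prints 250
```

Note that the required number is divided by the retention rate (1 minus the drop-out rate) rather than multiplied by (1 plus the drop-out rate); the latter would under-recruit slightly.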
I am guessing that there are not many options for the type of test. If you have two treatment groups, use ANOVA (1 df in the numerator) or a t-test (which should give the same probability as ANOVA with 1 df in the numerator). With more than two treatment groups, use ANOVA. What is missing from your question is any information about the magnitude of the difference(s) between treatment group means. Larger differences between treatment group means give rise to larger t values or F statistics. A simple approach would be to visually scan a t-table or an F-table. Use the DF column in the t-table, or the denominator degrees of freedom in the F-table (for the given numerator degrees of freedom), to find the critical value and hence, given an estimate of the standard error, the minimal required difference between treatment group means for any chosen level of statistical significance. Being able to read t-tables and F-tables, and to interpret them correctly, is a useful skill for experimental design.
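As a rough companion to scanning the tables, the link between the size of the difference and the number of subjects can be sketched with the usual normal approximation for comparing two group means, n per group = 2 (z_(1-alpha/2) + z_power)^2 (sd / delta)^2. The function name `n_per_group` and the example values (difference of 5, standard deviation of 10) are assumptions for illustration only, and the normal approximation gives slightly smaller numbers than an exact t-based calculation.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per group for a two-sided two-sample comparison.

    delta -- difference between group means worth detecting (assumed)
    sd    -- common within-group standard deviation (assumed)
    """
    if delta == 0 or sd <= 0:
        raise ValueError("delta must be non-zero and sd must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical inputs: detect a difference of 5 units when sd is 10.
print(n_per_group(delta=5, sd=10))   # prints 63
# A larger difference needs far fewer subjects per group.
print(n_per_group(delta=10, sd=10))  # prints 16
```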
John W Pickering Thank you. I am conducting a retrospective cohort study, collecting patient specimens and clinical outcomes. The statistical analysis would estimate the hazard ratio of each predictor that I measure.
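For an analysis based on hazard ratios, what drives power is the number of events rather than the number of patients. One common starting point is Schoenfeld's approximation for a single binary predictor in a Cox model: events = (z_(1-alpha/2) + z_power)^2 / (p (1 - p) (ln HR)^2), where p is the proportion of patients who are biomarker-positive. The sketch below assumes that formula; the function names and the example values (HR of 2, 50% biomarker-positive, 30% of patients having an event) are hypothetical, and it ignores adjustment for other covariates and for testing several predictors.

```python
import math
from statistics import NormalDist


def events_required(hazard_ratio: float, prevalence: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Events needed to detect a hazard ratio (Schoenfeld approximation).

    hazard_ratio -- smallest HR worth detecting (assumed)
    prevalence   -- proportion of patients who are biomarker-positive
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard_ratio must be positive and not equal to 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    return math.ceil(z_sum ** 2 / (prevalence * (1 - prevalence) * log_hr ** 2))


def subjects_required(events: int, event_probability: float) -> int:
    """Patients needed so that the expected number of events is reached."""
    if not 0.0 < event_probability <= 1.0:
        raise ValueError("event_probability must be in (0, 1]")
    # Small tolerance guards against floating-point noise before rounding up.
    return math.ceil(events / event_probability - 1e-9)


# Hypothetical inputs: HR = 2, half the cohort biomarker-positive,
# and 30% of patients expected to have the event during follow-up.
d = events_required(hazard_ratio=2.0, prevalence=0.5)
print(d)                          # prints 66
print(subjects_required(d, 0.3))  # prints 220
```

In a retrospective cohort the event probability can often be estimated directly from the existing records, which makes the second step fairly concrete.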