I would contact the biostatistics group at MD Anderson. Years ago we had a program they provided that did those calculations. Dr Barry Brown was involved, and if he is still there, I'm sure he would walk you through it.
It may sound difficult, but I think it is easy: you need to know the prevalence of the disease and the population of the community, and then use the common formula, or go to the website http://www.statisticalsolutions.net/pss_calc.php for further help :)
I had the same problem. I found the following links really useful:
Muller et al. 1984: http://ukpmc.ac.uk/abstract/MED/6542184/reload=0;jsessionid=YG54kPaNnmNyyjqmUQSr.0
Testing et al. 1998: http://www.ucd.ie/vavctest/kierantest/biomedical/html/alternatives-to-animals/Reducing%20the%20use%20of%20Laboratory%20Animals%20in%20Biomedical%20Research.pdf
You will find a clear and well-written article attached. I think it will be helpful for understanding the basic problem of power calculation and sample size. Enjoy reading!
For prevalence study use this formula to calculate sample size:
$$n = \frac{Z^{2}\,P\,(1-P)}{d^{2}}$$
(Z and d are squared, not multiplied by 2)
Where n= sample size
Z= Z statistic for a level of confidence
P= expected prevalence or proportion
(in proportion of one: if 20%, P=0.2)
d= precision (in proportion of one; if 5%, d=0.05)
For 95% level of confidence which is conventional, Z value is 1.96.
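As a rough illustration, the prevalence formula above can be sketched in a few lines of Python. The function name prevalence_sample_size is made up for this example, and Z is taken from the standard normal quantile rather than hard-coded as 1.96, so the result can differ very slightly from a hand calculation before rounding up.

```python
from math import ceil
from statistics import NormalDist


def prevalence_sample_size(p, d, confidence=0.95):
    """Sample size for estimating a prevalence p with absolute precision d.

    Implements n = Z^2 * P * (1 - P) / d^2, rounded up to a whole subject.
    p and d are proportions of one (20% -> 0.2, 5% -> 0.05).
    """
    # Two-sided Z statistic for the requested confidence level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / d ** 2)


# Expected prevalence 20%, precision 5%, 95% confidence.
print(prevalence_sample_size(0.2, 0.05))  # 246
```

With P = 0.2 and d = 0.05 the formula gives about 245.9, which rounds up to 246 subjects. If no prior estimate of the prevalence is available, P = 0.5 is the most conservative choice because it maximises P(1 - P).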
To calculate the sample size when comparing diseased with normal subjects, e.g. in gene expression studies, you can take the power of the test as 80% (power = 0.80, i.e. beta = 0.20), alpha = 0.05, SD = 1.96, an expected mean in normal of mu(0) = 1 (relative gene expression in normal is taken as one) and an expected mean in disease of mu(1) = 2 (so that the study can detect a two-fold increase in gene expression at the 95% confidence level). With these parameters, the sample size will be 31 cases for a two-sided test. (This is just an example.) I hope it will be of help to you...
I am attaching an Excel file which I received from Helmut Schutz. It will assist you in estimating the sample size for a clinical study.
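The figure of 31 is consistent with the one-sample normal-approximation formula n = ((z_(1-alpha/2) + z_(1-beta)) * SD / (mu(1) - mu(0)))^2. That this is the formula the poster had in mind is an assumption, and the function name below is made up for illustration. A minimal sketch:

```python
from math import ceil
from statistics import NormalDist


def one_sample_size(mu0, mu1, sd, alpha=0.05, power=0.80):
    """Cases needed to detect a shift from mu0 to mu1 (two-sided test).

    Normal approximation; a t-based calculation gives a slightly larger n.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(((z_alpha + z_beta) * sd / (mu1 - mu0)) ** 2)


# Parameters from the example: mu(0) = 1, mu(1) = 2, SD = 1.96.
print(one_sample_size(1, 2, 1.96))  # 31
```

Note that comparing two independent groups (rather than one group against a fixed reference value) roughly doubles the number needed per group under the same approximation.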
This is covered by Claudia Beleites in her paper on 'Sample size planning for classification models': http://dx.doi.org/10.1016/j.aca.2012.11.007 (also on arXiv at http://arxiv.org/abs/1211.1323). The slides from her presentation at the CLIRSPEC Conference in Exeter 2015 (http://clirspec.org/conference/) are also available from the arXiv page.
Many years ago the Dept of Biostats at MD Anderson had a plug-in program (authored by Barry Brown, PhD) on their website. It was free and easy to use. Maybe it is still there.
This will depend on the topic of the clinical study.
If it is about validating some biomarker (or other things that come in metric measures such as concentrations), you start from guesstimates of effect size and the expected patient-to-patient variation.
If it is about some diagnostic classifier, i.e. about recognizing categories, where e.g. sensitivity or specificity or predictive values are to be reported, our paper mentioned by Alex shows how to calculate the number of patients necessary to:
measure sensitivity & Co. with a given precision
show that the classifier is better than a prespecified threshold performance
compare the classifier to e.g. previously reported performance.
We found that for fewer than several hundred patients, the validation part is the bottleneck rather than the training.
The supplementary material includes R code to do these calculations. Note that for these scenarios, Gaussian approximations typically do not work well and underestimate the necessary number of patients.
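To make the point about Gaussian approximations concrete, here is a small Python sketch (not the R code from the supplementary material, and not the method used in the paper) that compares the Gaussian (Wald) confidence interval for an observed sensitivity with the exact Clopper-Pearson interval. The function names are made up for this illustration, and the exact bounds are found by simple bisection on the binomial tail probabilities.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, iters=60):
    """Bisection on [0, 1] for g(p) = target, where g increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, confidence=0.95):
    """Exact two-sided interval for a proportion with k successes out of n."""
    a = (1 - confidence) / 2
    # Lower bound: P(X >= k | p) = a.  Upper bound: P(X <= k | p) = a.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), a)
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - a)
    return lower, upper


def wald(k, n, confidence=0.95):
    """Gaussian-approximation interval for the same proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = k / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


# 10 of 10 diseased patients correctly recognized.
print(wald(10, 10))             # zero-width interval at 1.0
print(clopper_pearson(10, 10))  # lower bound is roughly 0.69
```

With 10 out of 10 correct, the Gaussian interval collapses to a single point and suggests perfect certainty, while the exact interval still allows a sensitivity as low as about 0.69. Planning a study on the Gaussian width would therefore call for too few patients.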
Feel free to contact me directly if you have further questions.
To do a power analysis to estimate your sample size, you first have to write down your hypothesis, and based on that you decide which statistical test you will use. It should be an inferential test. You then need to determine the following: alpha (conventionally .05), power (conventionally .80), and effect size (small, moderate, or large; each test has its own values, which you can find online). Then download a free program to calculate the sample size, such as G*Power.
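As a rough cross-check on what such a program reports, the three inputs (alpha, power, effect size) can be combined by hand for the simplest case. The sketch below assumes a two-sided comparison of two independent group means with a standardized effect size d (Cohen's d) and uses the normal approximation n per group = 2 * ((z_(1-alpha/2) + z_(power)) / d)^2. The function name is made up for this example, and dedicated software that works with the t distribution will report slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided two-sample comparison.

    effect_size is the standardized mean difference (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Conventional small, moderate and large effect sizes for d.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

The required sample size grows with the inverse square of the effect size, which is why the guess for the effect size usually matters more than the choice between power of .80 and .90.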