I'm looking for a formula that calculates the minimum coverage needed to call a variant allele with a given confidence level, given the expected variant allele frequency (VAF), i.e. the percentage of mutants at a given position in the sample, and the expected sequencing error rate, assumed to be uniform. This is not about estimating the VAF itself; it's about stating the presence or absence of the allele in question. I need a formula based on statistics, not a practical estimate. I suppose one needs to build a confidence interval on the number of Bernoulli trials or something like that. I'm willing to work that out myself, but I figured someone might already have, and if so, finding that solution could be faster.

After googling briefly I found some rule-of-thumb recommendations, including at https://eu.idtdna.com/pages/education/decoded/article/how-important-are-those-ngs-metrics#coverage_depth and in a good review article, "Understanding the Basics of NGS: From Mechanism to Variant Calling", although it only focuses on the 50% VAF case. Another article reports simulation results: https://www.ncbi.nlm.nih.gov/core/lw/2.0/html/tileshop_pmc/tileshop_pmc_inline.html?title=Click%20on%20image%20to%20zoom&p=PMC3&id=3873500_gr1.jpg
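
To make the Bernoulli-trials idea concrete, here is a minimal sketch of one possible framing, not an established formula. It assumes each read at the position is an independent Bernoulli trial: with no variant present, a read shows the variant allele with probability equal to the error rate; with the variant present, with probability VAF*(1 - e) + (1 - VAF)*e. It then searches for the smallest depth at which a read-count threshold keeps the false-positive probability at or below alpha while detecting the variant with probability at least 1 - beta. The function names, the default alpha and beta, the `max_depth` cap, and the error model itself are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    if k < 0 or k > n:
        return 0.0
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    lower = sum(binom_pmf(i, n, p) for i in range(min(k, n + 1)))
    return max(0.0, 1.0 - lower)


def call_threshold(n, error_rate, alpha):
    """Smallest read count k such that P(X >= k | errors only) <= alpha."""
    tail = 1.0  # P(X >= 0)
    for k in range(n + 1):
        if tail <= alpha:
            return k
        tail -= binom_pmf(k, n, error_rate)  # now P(X >= k + 1)
    return n + 1  # no attainable count is rare enough at this depth


def min_coverage(vaf, error_rate, alpha=0.01, beta=0.05, max_depth=5000):
    """First depth n (and its threshold k) with power >= 1 - beta, or None.

    Assumption: error_rate is the per-read probability of miscalling
    specifically to (or away from) the variant allele.
    """
    p_alt = vaf * (1.0 - error_rate) + (1.0 - vaf) * error_rate
    for n in range(1, max_depth + 1):
        k = call_threshold(n, error_rate, alpha)
        power = upper_tail(k, n, p_alt)
        if power >= 1.0 - beta:
            return n, k
    return None


# Illustrative parameter values only.
for vaf in (0.5, 0.1, 0.05):
    print(vaf, min_coverage(vaf, error_rate=0.01))
```

Because read counts are discrete, the power is not strictly increasing in depth, so the value returned is the first depth that meets both criteria, and a slightly larger depth can briefly fall short again. The `1 - P(X <= k - 1)` form is also only reliable for alpha well above floating-point precision.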

UPD (Oct 31, 2019):

Article Standardization of Sequencing Coverage Depth in NGS: Recomme...
