We are currently aiming to sequence human DNA at 90x coverage on a NovaSeq 6000 system (TruSeq DNA PCR-Free Library Prep). The run parameters all look fine, and with around 1.2 Tb of output the yield should be sufficient (3 samples per flow cell). However, when we take a closer look at the coverage of our samples, we see that some regions of the genome are covered at up to 300-400x, whereas a broad range of regions is covered at less than 40x (graphics attached). At the same time, the percentage of duplicates is very low (~6%). Where could this apparent selective amplification come from?

Some details and stats for the alignment:

- Aligner: Isaac (Illumina, iSAAC-03.16.02.19)

- Reference genome: Homo sapiens (Ensembl GRCh37)

- Total aligned reads: ~2,000,000,000 (~93%)

- Fragment length: ~394 bp

- Percent duplicate proper read pairs: ~6%
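
As a rough sanity check on the numbers above, here is a minimal sketch of the expected mean coverage. Several values are assumptions that are not stated in this post: a read length of 150 bp, a genome size of about 3.1 Gb, and the reading that the 1.2 Tb is the output of one flow cell shared by 3 samples. The Poisson spread at the end is only an idealized model of uniform random sampling, not a description of real data.

```python
import math

# Values taken from the post
FLOWCELL_OUTPUT_BP = 1.2e12   # ~1.2 Tb (assumed to be per flow cell)
SAMPLES_PER_FLOWCELL = 3
ALIGNED_READS = 2.0e9         # ~2,000,000,000 aligned reads

# Assumptions, NOT stated in the post
READ_LENGTH_BP = 150          # assumed read length
GENOME_SIZE_BP = 3.1e9        # approximate human genome size


def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Mean depth = total sequenced bases / genome size."""
    return total_bases / genome_size


# Expected depth per sample from the raw flow cell output
cov_from_output = mean_coverage(
    FLOWCELL_OUTPUT_BP / SAMPLES_PER_FLOWCELL, GENOME_SIZE_BP
)

# Expected depth per sample from the aligned read count
cov_from_reads = mean_coverage(ALIGNED_READS * READ_LENGTH_BP, GENOME_SIZE_BP)

# Under idealized uniform sampling, depth is roughly Poisson distributed,
# so the standard deviation is about the square root of the mean.
poisson_sd = math.sqrt(cov_from_reads)

print(f"Mean coverage from output: {cov_from_output:.1f}x")
print(f"Mean coverage from aligned reads: {cov_from_reads:.1f}x")
print(f"Idealized Poisson SD: {poisson_sd:.1f}x")
```

Under these assumptions the mean depth is at or above the 90x target, and the idealized spread is only about 10x, so regions below 40x or above 300x are far outside what random sampling alone would produce.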

Many thanks in advance!
