02 February 2013

I am working with a few NGS datasets. When I run samtools rmdup, it sometimes removes 10-15% of the reads and sometimes 70-80%. The second case presumably arose from a problem with the sequencing chemistry. What I really want to know is: on what basis are PCR duplicates removed? I have found some explanations on SEQanswers, but they were not what I needed. I would appreciate your suggestions and, if possible, links to documentation on samtools rmdup.
