Many people have noticed a sequence content bias at the 5' end of Illumina RNA-Seq reads, which is usually attributed to the random hexamers used for priming. There has been a proposal to correct coverage for this bias, and some people simply clip the first 13 bases from the 5' end if their reads are long enough. Is clipping actually useful, or does it just remove valuable sequence data?
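To make the question concrete, here is a minimal Python sketch of what I mean by clipping, plus a way to look at per-position base composition before and after. The function names `clip_5prime` and `base_composition` are made up for illustration and do not come from any particular tool, and the default of 13 bases is just the figure mentioned above, not a recommendation.

```python
from collections import Counter


def clip_5prime(seq, qual, n=13):
    """Drop the first n bases (and their qualities) from the 5' end of a read."""
    return seq[n:], qual[n:]


def base_composition(seqs, positions=20):
    """Return, for each of the first `positions` cycles, the fraction of A/C/G/T.

    Positions with no observed bases yield an empty dict.
    """
    counts = [Counter() for _ in range(positions)]
    for s in seqs:
        for i, base in enumerate(s[:positions]):
            counts[i][base] += 1
    return [
        {b: c[b] / sum(c.values()) for b in "ACGT"} if c else {}
        for c in counts
    ]
```

If the composition in the first dozen or so cycles differs from the roughly flat profile later in the read, that is the bias in question. Clipping removes those cycles from the read, but whether that helps downstream analysis is exactly what I am asking.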

Similar questions and discussions