Dear all,

Over the past couple of years, we have been sequencing multiple ddRAD libraries, prepared in our lab following the protocol of Petersen et al. 2014, on the HiSeq2500 (SR 100). We pooled either 48 individuals (using 48 inline barcodes) or 96 individuals (48 inline barcodes x 2 indices) and used 10% PhiX on the sequencing runs. This always produced good data for us.

For our latest work, we had to move our sequencing runs to the NovaSeq6000 (SR 100, SP flowcell) because our sequencing provider stopped operating the HiSeq2500. For this run we pooled 288 individuals (48 inline barcodes x 6 indices). According to the sequencing facility, 10% PhiX was again added.

When we received our data, approximately 30% of the reads got filtered out by the chastity filter due to poor read quality and another 30% of the data consisted only of polyG reads.
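
For anyone who wants to compare numbers on their own runs: the polyG share can be estimated directly from the FASTQ file. The snippet below is only a rough sketch, not the method the facility used. The function names are made up for illustration, and the cutoff of 90% G per read is an arbitrary assumption that should be adjusted as needed.

```python
import gzip
from typing import Iterable, Iterator


def read_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def is_polyg(seq: str, min_fraction: float = 0.9) -> bool:
    """Call a read polyG if at least min_fraction of its bases are G.

    The 0.9 default is an assumed cutoff, not an established standard.
    """
    return len(seq) > 0 and seq.count("G") / len(seq) >= min_fraction


def polyg_fraction(lines: Iterable[str], min_fraction: float = 0.9) -> float:
    """Return the fraction of reads classified as polyG."""
    total = 0
    polyg = 0
    for seq in read_sequences(lines):
        total += 1
        if is_polyg(seq, min_fraction):
            polyg += 1
    return polyg / total if total else 0.0


def polyg_fraction_from_file(path: str, min_fraction: float = 0.9) -> float:
    """Same as polyg_fraction, reading a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return polyg_fraction(handle, min_fraction)
```

Calling `polyg_fraction_from_file("reads.fastq.gz")` (the file name is a placeholder) returns a value between 0 and 1.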

After some reading, we learnt that the polyG reads could be sequencing artefacts resulting from poor clustering, which might be caused by low-complexity regions in our libraries (the enzyme cut sites, since they are identical in all reads, and potentially also the barcodes).
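
To illustrate what low complexity means here, per-cycle base composition can be tabulated from the reads. This is a minimal sketch under assumptions: the function names are hypothetical, the 80% threshold for flagging a cycle is arbitrary, and the toy reads in the usage note use an invented shared motif rather than our actual barcodes or cut site.

```python
from collections import Counter
from typing import Dict, Iterable, List


def base_composition(seqs: Iterable[str], n_cycles: int) -> List[Dict[str, float]]:
    """Return, for each of the first n_cycles positions, the base fractions."""
    counts = [Counter() for _ in range(n_cycles)]
    for seq in seqs:
        for pos, base in enumerate(seq[:n_cycles]):
            counts[pos][base] += 1
    composition = []
    for c in counts:
        total = sum(c.values())
        composition.append({b: (c[b] / total if total else 0.0) for b in "ACGTN"})
    return composition


def low_complexity_cycles(
    composition: List[Dict[str, float]], max_fraction: float = 0.8
) -> List[int]:
    """List 0-based cycles where one base reaches max_fraction (assumed cutoff)."""
    return [
        i for i, fractions in enumerate(composition)
        if max(fractions.values()) >= max_fraction
    ]
```

With four toy reads that differ in the first two bases and share the same five bases afterwards, for example `["ACTGCAGA", "GTTGCAGC", "TGTGCAGT", "CATGCAGG"]`, the flagged cycles are exactly the shared positions 2 to 6.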

A peculiar detail of the output was that only 3% PhiX could be detected in the sequencing data.

The sequencing facility assured us that they added 10% PhiX and recommended using 25% or 30% PhiX in our next run.

So my questions are:

1. Does anyone have experience with running ddRADseq or RADseq libraries on a NovaSeq? If so, how much PhiX did you use for your run?

2. Did anyone experience this polyG problem in their sequencing runs?

3. If 10% PhiX really was added to the run, what could cause it to drop to only 3% in the output data?

Many thanks in advance!

Tamara
