Hi helpful people,
I have paired-end sequencing data for pooled samples whose libraries were prepared with a custom P5 adapter carrying an inline barcode, so the first 6 bases of each R1 read should contain the sample barcode. They do, which is great. BUT for some samples, up to 5% of the paired R2 reads also contain the barcode in their first 6 bases!
So for up to 5% of read pairs we have:
R1 read id 3001:
[barcode] [sequence............]
R2 read id 3001:
[barcode] [sequence............]
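Of course, we would expect some R2 reads to start with the same 6 bases as the corresponding barcode purely by chance. But for a 6-base barcode that should be roughly 1 in 4^6 = 4096 reads (about 0.02%, assuming roughly uniform base composition), not 5%!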
Any ideas what could cause the barcode to sometimes appear at the start of both R1 and R2 when only one barcoded adapter was used in library prep?
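For anyone who wants to check the same thing on their own data, here is a minimal Python sketch of how the fraction could be counted. It assumes plain 4-line FASTQ records with R1 and R2 in the same order; the function names, file names, and the example barcode are hypothetical and not taken from my actual pipeline.

```python
import gzip  # only needed for the commented usage example at the bottom


def read_sequences(handle):
    """Yield the sequence line (2nd line) of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def barcode_in_both_fraction(r1_handle, r2_handle, barcode):
    """Among pairs whose R1 starts with `barcode`, return the fraction
    whose R2 also starts with it. Assumes both files list reads in the
    same order, so record n in R1 pairs with record n in R2."""
    r1_hits = 0
    both_hits = 0
    for s1, s2 in zip(read_sequences(r1_handle), read_sequences(r2_handle)):
        if s1.startswith(barcode):
            r1_hits += 1
            if s2.startswith(barcode):
                both_hits += 1
    return both_hits / r1_hits if r1_hits else 0.0


def chance_expectation(barcode_length):
    """Probability that a random read starts with one specific barcode,
    assuming all four bases are equally likely at every position."""
    return 0.25 ** barcode_length


# Hypothetical usage (file names and barcode are placeholders):
# with gzip.open("sample_R1.fastq.gz", "rt") as r1, \
#         gzip.open("sample_R2.fastq.gz", "rt") as r2:
#     observed = barcode_in_both_fraction(r1, r2, "ACGTAC")
#     print(observed, "observed vs", chance_expectation(6), "expected by chance")
```

The uniform-base assumption in `chance_expectation` is only a rough baseline; real libraries with skewed base composition would shift it somewhat, but nowhere near 5%.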