Hello everyone,
I have performed amplicon sequencing (MiSeq, Illumina) of a 180 bp region in transgenic Zea mays. The target region is part of the transgene. I have detected a SNP at the same position several times in different transgenic maize samples (from different varieties). The analyzed maize carries the transgene in a hemizygous state, meaning the transgene is located on only the maternal OR the paternal chromosome. Every DNA sample comes from a single maize grain. Normally, one would expect only one possible allele at every position in the transgene. But I frequently find two different bases at the same position. The reference base at this position is a cytosine, but I often detect a thymine as well.
Example:
Sample 1 at position 100: 50% C and 50% T
Sample 2 at position 100: 70% C and 30% T
Sample 3 at position 100: 90% C and 10% T
Sample 4 at position 100: 96% C and 4% T
Sample 5 at position 100: 99% C and 1% T
Sample 6 at position 100: 100% C
Altogether, I have sequenced more than fifty samples and detected this SNP at varying proportions. As already mentioned, every sample comes from a single maize grain. From a genetic point of view, the only logical explanations I can see for my findings are somatic mutations at position 100 or endoreplication. Does anyone have another explanation?
The second problem: are my findings real SNPs? I think the cases where T is detected at 30% or 50% are real. But what about SNP frequencies below 5%? Where is the border between background and real SNPs? Are there any guidelines?
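To make the threshold question concrete: one common way to frame it is a one-sided binomial test of the T read count against a background error rate. The sketch below is only an illustration under assumptions that are not from my data: a read depth of 5000 per sample, a C>T background error rate of 0.5%, and a significance cutoff of 0.001 are all placeholders. A real background rate would probably have to be estimated from a control, and the names `binom_tail` and `exceeds_background` are made up for this example.

```python
import math

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    # Log of each binomial term C(n, i) * p^i * (1-p)^(n-i) for i = k..n
    terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    m = max(terms)
    return min(1.0, math.exp(m) * sum(math.exp(t - m) for t in terms))

def exceeds_background(alt_reads, depth, error_rate=0.005, alpha=1e-3):
    """True if the alternate-allele count is unlikely under pure sequencing error.

    error_rate and alpha are assumed placeholder values, not measured ones.
    """
    return binom_tail(alt_reads, depth, error_rate) < alpha

# T fractions from the example samples; the depth of 5000 is hypothetical.
depth = 5000
for sample, t_fraction in enumerate([0.50, 0.30, 0.10, 0.04, 0.01, 0.00], start=1):
    alt_reads = round(t_fraction * depth)
    p_value = binom_tail(alt_reads, depth, 0.005)
    print(f"Sample {sample}: {alt_reads}/{depth} T reads, p = {p_value:.3g}, "
          f"above background: {exceeds_background(alt_reads, depth)}")
```

With this framing, the answer for the low-frequency samples depends strongly on the read depth and on the assumed error rate, so both would need to be known for each sample. With more than fifty samples tested, the cutoff would also need some correction for multiple testing.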
Additional information: Sanger sequencing also shows the SNP at the same position. Repeated sequencing of some samples gives (nearly) the same results.
Thanks for your help!!!