Hi,

We sequenced the genomes of ancestral (unevolved) and evolved populations of a bacterial species. We would like to determine genomic variants (SNPs/small INDELs) of the evolved strains in comparison to the ancestor. We expect the ancestor to be clonal, while the evolved population contains a mixture of different polymorphisms. The ancestor is >99% identical to the reference genome on NCBI.

For variant calling I see 3 options:

1. Calling variants against the NCBI reference and removing those that also occurred in the ancestor.

2. Calling the ancestor's variants against the NCBI reference, modifying the reference according to the ancestor, and using the result as the new reference sequence for mapping the evolved populations.

3. Using a somatic variant calling tool to compare variants between the ancestor and the evolved population directly.

I think that the first two methods may be problematic, because they only consider variants that were positively called. Information about low-quality variants is lost, and small differences in how the same variant is called in the two samples may cause problems.
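
To illustrate the concern with option 1, here is a minimal sketch in Python. The `Call` record, the quality threshold of 30 and the example positions are all made up for illustration and do not come from any real dataset or tool. It shows how a variant that is present in both samples, but falls just below the threshold in the ancestor, ends up reported as evolved-specific.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """Hypothetical, simplified variant call (not a real VCF record)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


def passing_keys(calls, min_qual):
    """Return the set of (chrom, pos, ref, alt) keys that pass the quality filter."""
    return {(c.chrom, c.pos, c.ref, c.alt) for c in calls if c.qual >= min_qual}


def subtract_ancestor(evolved, ancestor, min_qual=30.0):
    """Option 1: keep evolved calls that are absent from the filtered ancestor calls."""
    return sorted(passing_keys(evolved, min_qual) - passing_keys(ancestor, min_qual))


# Illustrative calls: position 200 is present in both samples,
# but its quality in the ancestor is just below the threshold.
ancestor = [
    Call("chr", 100, "A", "G", 50.0),
    Call("chr", 200, "C", "T", 25.0),
]
evolved = [
    Call("chr", 100, "A", "G", 60.0),
    Call("chr", 200, "C", "T", 35.0),
    Call("chr", 300, "G", "A", 45.0),
]

evolved_only = subtract_ancestor(evolved, ancestor)
print(evolved_only)
```

Position 200 appears in the result together with the genuinely new variant at position 300, because the evidence for it in the ancestor was discarded by the filter before the comparison.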

I was wondering if somatic variant calling could be a better alternative, because it directly considers the difference between two sequencing datasets. Information on the application of this method is usually limited to eukaryotic datasets, mostly tumor vs. normal cells. I did not find any information on whether it can be applied to experimental evolution datasets from bacteria.
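
What I mean by a direct comparison can be sketched as follows. This is only an illustration of the idea, not how any particular somatic caller works: it takes the reference and alternate read counts at one site in both samples (the counts below are invented) and applies a two-sided Fisher's exact test, implemented here with the standard library only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are samples (ancestor, evolved); columns are read counts
    supporting the reference and the alternate allele.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x reference reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at least as unlikely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Invented counts at a single site: (ref reads, alt reads) per sample.
anc_ref, anc_alt = 100, 0
evo_ref, evo_alt = 70, 30

p_value = fisher_exact_two_sided(anc_ref, anc_alt, evo_ref, evo_alt)
print(f"p = {p_value:.3g}")
```

Because the raw read counts of both samples enter the test, weak evidence in the ancestor is not thrown away by a quality filter first. Real tools would additionally have to account for base and mapping qualities and for multiple testing across the genome.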

Does anyone have suggestions on how to proceed, and on whether somatic variant calling would be an appropriate method?

Best,

Chris
