I have analysed some pooled RAD-seq data, and there is a particular program I want to use, but it requires an mpileup file containing (1) only one population and (2) only one chromosome.

This is a bit of a hassle, because in reduced-representation population genomics you analyse multiple populations and fragments together, so the mpileup file contains multiple populations across many different loci. Extracting a single chromosome from an mpileup is possible because every line begins with the chromosome name, but is it possible to pull out which reads belong to each population?

I have provided a small example for "chromosome" E137_L96 (in the case of RAD-seq, it's a "fragment") at the first 3 bases. There are four populations in the sample.

Does anyone know if this is possible? I know that, technically, the .bam files hold each population's mapping information, but the program I want to use does not accept .bam input.
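
To make the request concrete, here is a minimal sketch of the kind of extraction I have in mind. It assumes the file was produced by `samtools mpileup` from one .bam per population with the default output columns, so that after the three leading columns (chromosome, position, reference base) each population contributes three columns (depth, read bases, base qualities), in the order the .bam files were given. The function name `extract_population` and the 1-based `pop_index` argument are hypothetical, not part of any existing tool.

```python
import sys
from typing import Iterable, Iterator


def extract_population(lines: Iterable[str], chrom: str, pop_index: int) -> Iterator[str]:
    """Yield single-population mpileup lines for one chromosome/fragment.

    Assumed layout (default samtools mpileup): chrom, pos, ref, then
    (depth, bases, quals) repeated once per input .bam (population).
    pop_index is 1-based; the name is illustrative only.
    """
    if pop_index < 1:
        raise ValueError("pop_index must be >= 1")
    start = 3 + 3 * (pop_index - 1)  # first column of this population's triplet
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chrom:
            continue  # skip other chromosomes/fragments (and blank lines)
        if len(fields) < start + 3:
            raise ValueError(f"line has no columns for population {pop_index}")
        # Keep the three leading columns plus this population's triplet.
        yield "\t".join(fields[:3] + fields[start:start + 3])


# Command-line use (file names are hypothetical):
#   python extract_pop.py pools.mpileup E137_L96 2 > pop2_E137_L96.mpileup
if __name__ == "__main__" and len(sys.argv) == 4:
    path, chrom, pop = sys.argv[1], sys.argv[2], int(sys.argv[3])
    with open(path) as handle:
        for out_line in extract_population(handle, chrom, pop):
            print(out_line)
```

If the mpileup was written with extra per-sample columns (for example mapping qualities or read positions), the fixed width of three columns per population would no longer hold and the offset would need adjusting.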

Alternatively, does anyone know of any good NGS software that can estimate heterozygosity from pooled data? I know PoPoolation exists, but to my knowledge it doesn't estimate 'H'.
