I performed two RAD-seq assemblies using ipyrad. I filtered the output .vcf file of each assembly with vcftools to limit missing data to at most 60% per locus. For one of the assemblies, I also imposed a minor allele frequency threshold of 5%.

My results show that the mean amount of missing data per individual decreases more when the minor allele frequency threshold is not imposed, that is, when the data are filtered only for the amount of missing data per locus.
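To make the comparison concrete, here is a minimal Python sketch of the two statistics involved. It is not my actual pipeline (the real filtering was done with vcftools), and the genotype matrix, the function names, and the 0/1/2 coding are all assumptions made up for illustration; the numbers are not my results.

```python
# Toy sketch only: genotypes are coded as the number of alternate alleles
# (0, 1 or 2) and None marks a missing call. Rows are sites, columns are
# individuals. All values below are invented for illustration.

def site_missing_fraction(site):
    """Fraction of individuals with a missing genotype at one site."""
    return sum(g is None for g in site) / len(site)

def minor_allele_frequency(site):
    """Minor allele frequency computed from the called genotypes only."""
    called = [g for g in site if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)

def filter_sites(sites, max_missing=0.6, min_maf=None):
    """Keep sites with at most max_missing missing data and, optionally,
    a minor allele frequency of at least min_maf."""
    kept = []
    for site in sites:
        if site_missing_fraction(site) > max_missing:
            continue
        if min_maf is not None and minor_allele_frequency(site) < min_maf:
            continue
        kept.append(site)
    return kept

def mean_missing_per_individual(sites):
    """Mean, over individuals, of each individual's missing fraction."""
    if not sites:
        raise ValueError("no sites left after filtering")
    n_ind = len(sites[0])
    per_ind = [sum(site[i] is None for site in sites) / len(sites)
               for i in range(n_ind)]
    return sum(per_ind) / n_ind

# Hypothetical matrix: 4 sites x 4 individuals.
sites = [
    [0, 0, 0, 0],           # monomorphic, fully genotyped
    [0, 1, None, 2],        # polymorphic, one missing call
    [None, None, None, 1],  # 75% missing
    [0, 0, 1, None],        # polymorphic, one missing call
]

missing_only = filter_sites(sites, max_missing=0.6)
missing_and_maf = filter_sites(sites, max_missing=0.6, min_maf=0.05)

print(len(missing_only), mean_missing_per_individual(missing_only))
print(len(missing_and_maf), mean_missing_per_individual(missing_and_maf))
```

The two printed lines correspond to the two filtering schemes I am comparing: missing data per locus only, and missing data per locus plus the minor allele frequency threshold.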

I would like to know whether anyone knows of a paper that proposes a hypothesis for this finding. I have searched but did not find anything on the impact of minor allele frequency filters on the amount of missing data per individual.

Thank you in advance.
