I am planning to conduct a genetic diversity and population structure study in African zebu cattle. I will use 77k SNP markers to genotype the population. What are your thoughts on the ideal sample size?
Front. Genet., 10 October 2018 | https://doi.org/10.3389/fgene.2018.00438
Samples were genotyped at Geneseek (Neogen Corporation, Nebraska, United States) using the Geneseek Genomic Profiler (GGP) High Density (HD) SNP array consisting of 150,000 SNPs, while SNPs for the reference breeds had been genotyped with the Illumina HD Bovine (777K SNPs) array. The SNPs in the GGP array were optimised for use in dairy cattle and comprise the most informative SNPs from the Illumina Bovine 50k and 770k chips, plus additional variants known to have a large effect on disease susceptibility and performance. Genotype data quality control and checks were carried out using PLINK v1.9 (Purcell et al., 2007) and included removal of SNPs with less than 90% call rate, SNPs with less than 5% minor allele frequency (MAF), and samples with more than 10% missing genotypes. Additional removal of SNPs not mapped to any chromosome left a total of 120,591 SNPs for analysis. Of the 299 animals, 12 failed the quality checks outlined above and were removed from the analysis. The total genotyping rate in the remaining samples was 0.991. The 120,591 SNPs used in the analysis covered 2516.25 Mb, with an average distance of 22.67 kb between adjacent SNPs. Chromosome length ranged from 42.8 Mb on BTA 25 to 158.86 Mb on BTA 1. The mean distance between adjacent SNPs per chromosome ranged between 18.67 and 23.89 kb on BTA 14 and BTA 29, respectively. The linkage disequilibrium (LD) across the genome averaged 0.41. Private alleles, defined as variants which are segregating in only one population when evaluating multiple populations, were identified using a custom script in R. A total of 143 private variants, most (132) of which originated from the Rwanda cattle population, were detected.
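The paper's custom R script for private alleles is not shown, so the following is only a minimal Python sketch of the definition quoted above (a variant segregating in exactly one population). It assumes genotypes are coded as 0/1/2 copies of one allele with NaN for missing calls; the function name `private_alleles` and its arguments are hypothetical, not taken from the paper.

```python
import numpy as np

def private_alleles(genotypes, populations):
    """Map each population to the SNP column indices that segregate only there.

    genotypes: samples x SNPs array, coded 0/1/2 allele copies, NaN = missing
               (this coding is an assumption, not the paper's stated format).
    populations: one population label per sample (row).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    populations = np.asarray(populations)
    labels = np.unique(populations)

    # segregating[i, j] is True when both alleles of SNP j occur in population i
    segregating = np.zeros((len(labels), genotypes.shape[1]), dtype=bool)
    for i, label in enumerate(labels):
        block = genotypes[populations == label]
        n_called = (~np.isnan(block)).sum(axis=0)   # non-missing genotypes per SNP
        allele_count = np.nansum(block, axis=0)     # copies of the counted allele
        segregating[i] = (allele_count > 0) & (allele_count < 2 * n_called)

    # A SNP is private when exactly one population segregates for it
    only_one = segregating.sum(axis=0) == 1
    return {
        str(label): np.flatnonzero(segregating[i] & only_one).tolist()
        for i, label in enumerate(labels)
    }

# Toy data: 4 animals, 4 SNPs, two hypothetical populations "A" and "B"
toy_genotypes = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 2, 0],
]
toy_populations = ["A", "A", "B", "B"]
print(private_alleles(toy_genotypes, toy_populations))
```

Note that under this literal reading, a variant that is fixed in a second population still counts as private to the population where it segregates; a stricter definition (allele absent everywhere else) would need a different test. With small samples per population, rare alleles are easily missed, so the count of private alleles is sensitive to sample size.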