Dear all,
I would like to use the software BayesAss (Wilson and Rannala, 2003) to estimate contemporary gene flow between several populations sampled from a river network (fish species).
I am working with 12 sampling sites and ~150 biallelic loci (SNPs), and population structure between sites is typically low to moderate (pairwise Fst ranges from 0.005 to 0.05). I also have several species at my disposal.
I am therefore aware that BayesAss might have difficulty properly estimating gene flow between sites, considering that the software typically performs better with moderate to strong genetic differentiation (i.e. fairly low gene flow). BayesAss results also depend strongly on the number of individuals sampled per site, so I tried two different configurations:
Config. 1: keeping the 12 populations separate (mean per-site N: 15 individuals)
Config. 2: grouping sampling sites into 4 headwaters from different tributaries + 1 downstream area (mean per-group N: 70 individuals)
I usually perform 3 to 4 runs with different random seeds and check convergence between runs by examining the trace of LogProb and the Bayesian deviance, as advised by Meirmans et al. (2014, doi: 10.1111/1755-0998.12216).
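For reference, this kind of run-to-run comparison can be sketched roughly as follows. This is only a minimal Python illustration, assuming the Bayesian deviance is taken as -2 times the mean post-burn-in LogProb; the function name, the burn-in handling and the LogProb values are made up for the example and are not actual BayesAss output.

```python
from statistics import mean


def bayesian_deviance(log_probs, burn_in=0):
    """Return -2 * mean(LogProb) over the samples kept after burn-in.

    Assumption: log_probs is a list of LogProb values already read from
    a run's trace, in sampling order.
    """
    kept = log_probs[burn_in:]
    if not kept:
        raise ValueError("no samples left after burn-in")
    return -2.0 * mean(kept)


# Hypothetical (made-up) post-burn-in LogProb samples for three seeds.
runs = {
    "seed_10": [-5204.0, -5198.0, -5201.0],
    "seed_20": [-5202.0, -5199.0, -5203.0],
    "seed_30": [-5200.0, -5205.0, -5198.0],
}

deviances = {name: bayesian_deviance(trace) for name, trace in runs.items()}
spread = max(deviances.values()) - min(deviances.values())

for name, dev in deviances.items():
    print(f"{name}: deviance = {dev:.1f}")
print(f"spread between runs = {spread:.1f}")
```

A small spread relative to the deviance itself is what I read as "similar between runs"; what counts as small is a judgment call and is not fixed by this sketch.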
> With config. 1, I usually get very high acceptance rates, outside the ideal range of 20-60%. However, LogProb converges well across runs and the Bayesian deviance is rather similar between runs. What seems a bit odd to me is that all non-zero migration rates (expressed as the proportion of individuals in pop. [i] that are immigrants from pop. [j]) point to the same origin: a given population [j] (always the same within and across runs) seems to provide migrants to all other populations, as if it were a source and all other populations were sinks (you may prefer to look at the attached output file). All other pairwise migration rates are not significantly different from zero, which contradicts the previously computed pairwise Fst values (although this might be because the number of sampled individuals is simply too small for BayesAss to identify some of them as migrants from another of our sampling sites).
In another case (another species), LogProb also converges well but the migration rates do not: there is still a 'source' population, but this time its identity changes across runs.
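To make explicit what I mean by a 'source' population and by its identity changing across runs, here is a minimal Python sketch. It assumes that m[i][j] is the proportion of individuals in pop. [i] that are immigrants from pop. [j], and that a rate is treated as different from zero when mean - 1.96 * SD is above zero; that interval rule, the function names and the 3-population matrices are assumptions made up for illustration, not BayesAss output.

```python
def supported_emigration(m, sd, z=1.96):
    """Sum, for each origin j, the off-diagonal rates m[i][j] whose
    approximate interval (mean - z * SD) lies above zero."""
    n = len(m)
    totals = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j and m[i][j] - z * sd[i][j] > 0.0:
                totals[j] += m[i][j]
    return totals


def apparent_source(m, sd, z=1.96):
    """Return the index of the origin with the largest supported total,
    or None if no off-diagonal rate is supported."""
    totals = supported_emigration(m, sd, z)
    best = max(range(len(totals)), key=totals.__getitem__)
    return best if totals[best] > 0.0 else None


def sources_across_runs(runs, z=1.96):
    """Apply apparent_source to a list of (m, sd) pairs, one per run."""
    return [apparent_source(m, sd, z) for m, sd in runs]


# Hypothetical (made-up) posterior means and SDs for one run, 3 populations.
m_run = [
    [0.74, 0.01, 0.25],
    [0.01, 0.79, 0.20],
    [0.01, 0.01, 0.98],
]
sd_run = [
    [0.04, 0.01, 0.04],
    [0.01, 0.04, 0.04],
    [0.01, 0.01, 0.02],
]

print(apparent_source(m_run, sd_run))
```

Collecting the result of sources_across_runs for all seeds gives one index per run; identical indices correspond to my first case (same 'source' everywhere), differing indices to the second.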
Instinctively, I would say that the exact values of the migration rates should be used with great caution. However, what I am ultimately interested in is whether some population(s) behave as source(s), and which one(s). So my question is:
Do you think I can trust the BayesAss results here, and consider that:
- If my 'source' population is the same across runs, this at least tells me that this population is indeed the strongest provider of migrants, even though other populations might also exchange migrants at rates too low for BayesAss to detect?
- If my 'source' population changes across runs, can I conclude that there is no actual source population and that all populations might be exchanging migrants to a similar extent (even though the runs still converge, and even though I do have 'null' migration rates within a given run)?
Or should I not trust any of this?
> Results from config. 2 behave the same way despite having more individuals per grouped population. The only differences are that acceptance rates are lower (within the ideal range of 20-60%) and that LogProb values vary slightly more across iterations, which seems quite logical since there should be more intrapopulation variability too.
Thank you very much for any suggestions! I apologize for this long, detailed post (I hope it will also be of interest to other students!)
Best regards