Help appreciated! We have empirical data sets of plant DNA barcodes (matK, ITS and rbcL) and want to compare how our analytical methods perform on these empirical data sets versus synthetic data sets.
How can we generate such synthetic data sets? I have found relevant papers but could not follow their methodology. I would appreciate it if anyone could describe the procedure in detail or point me to a tutorial.
REFERENCE: Supervised DNA barcodes paper:
Article Supervised DNA Barcodes species classification: Analysis, co...
From the paper:
Synthetic data:
Real DNA Barcode datasets are simulated with Coalescent package in Mesquite version 2.75 (see the related work [8]). The data are simulated considering time of species divergence and the effective population size (Ne), i.e., the number of individuals in a population (of a species) that are contributing genes to the succeeding generations.
Another similar paper:
Conference Paper Species Identification using DNA Barcode Sequences through S...
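
To make the question concrete, here is how I currently read the quoted passage, as a minimal Python sketch. It is not Mesquite's implementation. It assumes just two species that split `divergence` generations ago, a constant diploid population size `ne` (so coalescence is scaled by 2*Ne), and a Jukes-Cantor substitution model. All names (`Node`, `coalesce`, `mutate`, `simulate_barcodes`) and all default parameter values are hypothetical and only illustrative.

```python
import math
import random

BASES = "ACGT"

class Node:
    """Gene-tree node; `time` is in generations before the present."""
    def __init__(self, time, children=(), label=None):
        self.time = time
        self.children = list(children)
        self.label = label

def coalesce(lineages, start, end, ne, rng):
    """Merge lineages backward in time from `start` until `end` is reached.

    Assumption: diploid population, so k lineages coalesce at rate
    k(k-1)/2 per 2*ne generations (Kingman coalescent).
    """
    lineages = list(lineages)
    t = start
    while len(lineages) > 1:
        k = len(lineages)
        rate = k * (k - 1) / 2 / (2 * ne)
        t += rng.expovariate(rate)
        if t >= end:  # next event would fall beyond the divergence time
            break
        a, b = rng.sample(range(k), 2)
        merged = Node(t, [lineages[a], lineages[b]])
        lineages = [x for i, x in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
    return lineages

def mutate(seq, branch_len, mu, rng):
    """Jukes-Cantor: a site differs after time t with prob 3/4*(1-exp(-4/3*mu*t))."""
    p = 0.75 * (1 - math.exp(-4.0 / 3.0 * mu * branch_len))
    return "".join(
        rng.choice([b for b in BASES if b != s]) if rng.random() < p else s
        for s in seq
    )

def simulate_barcodes(n_per_species=5, ne=10_000, divergence=1_000_000,
                      seq_len=600, mu=1e-8, seed=0):
    """Return {label: sequence} for two species (illustrative defaults)."""
    rng = random.Random(seed)
    survivors = []
    for sp in ("spA", "spB"):
        tips = [Node(0.0, label=f"{sp}_{i}") for i in range(n_per_species)]
        # Within each species, lineages coalesce until the divergence time.
        survivors += coalesce(tips, 0.0, divergence, ne, rng)
    # Remaining lineages coalesce in the shared ancestral population.
    root = coalesce(survivors, divergence, math.inf, ne, rng)[0]
    root_seq = "".join(rng.choice(BASES) for _ in range(seq_len))
    sequences = {}
    stack = [(root, root_seq)]
    while stack:  # evolve the root sequence down every branch
        node, seq = stack.pop()
        if not node.children:
            sequences[node.label] = seq
        for child in node.children:
            branch = node.time - child.time
            stack.append((child, mutate(seq, branch, mu, rng)))
    return sequences

if __name__ == "__main__":
    for name, seq in sorted(simulate_barcodes().items()):
        print(f">{name}\n{seq}")
```

Running the script prints the simulated sequences in FASTA format. Raising `divergence` relative to `ne` should make the two species easier to separate, while a large `ne` with a recent split should leave them sharing similar sequences. Is this roughly what the Coalescent package in Mesquite does, and how would I extend it to many species?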