I am trying to infer both maximum likelihood and Bayesian phylogenetic trees from RADseq reads. I have already assembled my RAD loci with ipyrad, which I also used to export the consensus sequences for all of my samples, both as a concatenated-sequence PHYLIP file and as a SNPs-only PHYLIP file.

Should I use the whole sequences for my phylogenetic reconstructions, or only the SNPs?

Also, I know that a best-fit nucleotide substitution model should be chosen before building a phylogenetic tree. However, I have not seen any consensus in published papers on how this is done with RADseq data. Some articles choose a nucleotide substitution model without explaining how they chose it, others use software such as IQ-TREE to estimate a single model for the whole data set, and others use software such as PartitionFinder to split the data and estimate a substitution model for each partition.

Which would be the best approach, especially considering that my data sets have a high proportion of missing data (around 60%)? And, if a model test should be done, which program works well with files this large?
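
In case it helps to be precise about what "missing data" means here, below is a minimal Python sketch of how the proportion can be computed from a PHYLIP file. It is only an illustration under some assumptions: the file is in the relaxed sequential layout (a header line with the number of taxa and sites, then one "name sequence" line per sample), and `N`, `-` and `?` are counted as missing. The function name `missing_fraction` and the toy alignment are made up for this example.

```python
def missing_fraction(phylip_text, missing_chars="N-?"):
    """Return (per-sample missing fractions, overall missing fraction).

    Assumes a relaxed sequential PHYLIP layout: a header line with the
    number of taxa and sites, then one 'name sequence' line per sample.
    """
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    ntaxa = int(lines[0].split()[0])

    per_sample = {}
    for line in lines[1:ntaxa + 1]:
        name, seq = line.split(None, 1)
        seq = "".join(seq.split()).upper()  # drop any internal whitespace
        missing = sum(seq.count(c) for c in missing_chars)
        per_sample[name] = missing / len(seq)

    # All sequences in an alignment have the same length, so the mean of
    # the per-sample fractions equals the overall fraction of missing cells.
    overall = sum(per_sample.values()) / len(per_sample)
    return per_sample, overall


# Toy alignment (hypothetical data, not from my real data set)
example = """3 8
sampleA  ACGTNNNN
sampleB  AC--NNGT
sampleC  ACGTACGT
"""

per_sample, overall = missing_fraction(example)
for name, frac in per_sample.items():
    print(f"{name}\t{frac:.2f}")
print(f"overall\t{overall:.2f}")
```

For a real file, the text could be read with `open(path).read()` and passed to the same function; the per-sample values show whether the missing data are spread evenly or concentrated in a few samples.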

Any help would be greatly appreciated!
