What software could I use to evaluate consistency in a Bayesian phylogeny?

Hi Ana,

Happy New Year! Please see the link below; it may be useful for you.

https://www.researchgate.net/post/What_is_the_best_choice_between_Maximum_Likelihood_and_Bayesian_Inference_for_inferring_phylogenetic_relationships_especially_at_low-taxonomic_levels

Brian Thomas Foley

MESQUITE is a phylogenetic analysis package (as are MEGA, PHYLIP, PAUP, DAMBE and a few others). A package such as MESQUITE has many advantages, such as built-in functions for consistency index scoring, but a disadvantage is that you need to learn how to use the package: importing your data file, importing your tree file, and running the job you want. It can sometimes be frustrating to figure out exactly what data format is needed for each type of data.

Anyway, MESQUITE does have a consistency index module built in. I do not find this built in to DAMBE or MEGA.

Before you go to a lot of trouble calculating the consistency index value for your data and tree, I think you should find out if you will gain any useful information from this value. Do you know what a "good" value should be for your type of data, for example? The Consistency Index can be very useful for morphological character data sets in some organisms where morphology evolves nicely. For DNA and amino acid sequence data the consistency index usually does not give us much information about the quality of the data or the tree.
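For reference, the ensemble consistency index is CI = M / S, where M is the minimum number of changes the characters could show on any tree and S is the number of changes actually required on the tree in question. The following is a minimal Python sketch of that calculation, assuming a rooted binary tree written as nested tuples and unordered characters counted with Fitch parsimony. The function names and the toy alignment are illustrative only and are not taken from MESQUITE or any other package.

```python
def fitch_steps(tree, states):
    """Return (possible_states, steps) for one character on a binary tree.

    tree   : nested tuples of taxon names, e.g. (("A", "B"), ("C", "D"))
    states : dict mapping each taxon name to its state for this character
    """
    if isinstance(tree, str):  # leaf
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:  # no change needed at this node
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def consistency_index(tree, alignment):
    """Ensemble CI = (sum of minimum steps) / (sum of observed steps)."""
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    min_steps = 0
    observed_steps = 0
    for i in range(n_sites):
        column = {taxon: alignment[taxon][i] for taxon in taxa}
        # A character with k distinct states needs at least k - 1 changes.
        min_steps += len(set(column.values())) - 1
        observed_steps += fitch_steps(tree, column)[1]
    if observed_steps == 0:  # all sites constant: CI is undefined
        return float("nan")
    return min_steps / observed_steps


# Toy data (hypothetical): site 1 fits the tree, site 2 conflicts with it.
tree = (("A", "B"), ("C", "D"))
alignment = {"A": "AC", "B": "AT", "C": "GC", "D": "GT"}
print(consistency_index(tree, alignment))  # 2 minimum changes over 3 observed
```

Note that this sketch counts every variable site; some programs also report a CI that excludes parsimony-uninformative characters, so values may differ from what a given package prints.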

http://mesquiteproject.wikispaces.com/home

Ana Agapito

Hi Ali and Brian, thanks a lot for the answers.

Ali, according to what I read, it would be best for me to use the ML approach, as my studies are at a low taxonomic level (species and subspecies), so Bayesian inference could rely too much on my prior. But if I include data from several genes, could I compensate for the lack of information? How could I check the influence of my data (informative or non-informative) on my posterior distribution?
Brian, actually I wanted to make trees from DNA sequence data from two different types of genes, so I'd like a quantitative way to compare how similar these two trees turn out to be. What index or other method would you suggest?

Brian Thomas Foley

Not all "species" are created equal. The definition of a "species" and the diversity within and between species can vary a lot between organisms such as viruses, plants, bacteria, fungi, fish, mammals, etc.

I suspect that the consistency index will be similar for all of your data matrices and trees if each matrix and tree is built from one gene at a time. But if you concatenate or merge all of the genes into one file and build a tree, you might get a much lower consistency index if there has been recombination such that different genes have different histories.

With sexually reproducing diploid organisms such as plants, animals and fungi, we expect a lot of recombination. Some modern humans apparently interbred with Neanderthals in Europe, for example, so some alleles there have quite a different history from the majority of the alleles in other modern humans.

In most animals, the mitochondrial genome tends to be inherited through the female lineage in eggs and not in sperm, but there are exceptions to almost every general pattern like that, and horizontal transfer of mitochondria has even been suggested or proven in some animals.

Very nice alternatives or supplements to the consistency index are the Quartet Puzzling diagrams produced by TreePuzzle and the various analyses of recombination in RDP (Recombination Detection Program). But it really depends on what type of organism you are studying, how the subspecies diverged from one another, and so on.

http://carrot.mcb.uconn.edu/~olgazh/bioinf2010/class30.html

Jamie R Stevens

Hi Ana,

Picking up on the second point in your follow-up post, I think you want to assess congruence between phylogenies. To address this in a quantitative way, you need to run a congruence test. I haven't run these for a while, but I suggest you search for the incongruence length difference (ILD) test and the Kishino-Hasegawa test. I've run these with old-style parsimony-derived trees and ML-derived trees, but (apologies) haven't performed such analyses with Bayesian trees. The nice thing is that you get a congruence statistic, rather than just eyeballing your two (or more) trees.
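For orientation, the ILD statistic of Farris et al. can be written as below. The symbols are chosen here for illustration: L_X and L_Y are the lengths (numbers of steps) of the most parsimonious trees for the two partitions analysed separately, and L_{X+Y} is the length of the most parsimonious tree for the combined data.

```latex
D_{XY} = L_{X+Y} - \left( L_X + L_Y \right)
```

Significance is then assessed by comparing the observed value with the values obtained from many random partitions of the combined characters into sets of the same sizes as the original two.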

Ana Agapito

Thanks a lot, Jamie. I was thinking of using an ML algorithm, so I'd like to know what software you used to run congruence tests on these trees.

Jamie R Stevens

I used PAUP, but I'm sure there must be more recent implementations of these tests.

See also:

Kishino, H., Hasegawa, M., 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29, 170–179.

Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1995. Testing significance of incongruence. Cladistics 10, 315–319.

Michael Krug

Hi Ana,

Check out the LRT (likelihood ratio test), e.g. as implemented in Concaterpillar, as it can distinguish topological congruence between trees from two partitions from branch-length compatibility. The ILD (incongruence length difference) test relies entirely on tree lengths in a parsimony context. If you work on very distantly related taxa, you are likely to find different substitution rates between clades in one gene but not in the other.

Additionally, RAxML can calculate the Pearson correlation coefficient between two sets of bootstrap trees (-f m), and this might be more useful than a bare p-value, which can be hard to interpret properly.
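To make the idea concrete, here is a minimal Python sketch of a Pearson correlation between bipartition support values from two bootstrap sets. It is not RAxML's code: it assumes the support frequencies have already been tabulated into two dictionaries keyed by a bipartition label, and it treats a bipartition missing from one set as having support 0. The function name and the example labels are hypothetical.

```python
from math import sqrt


def pearson_split_support(support_a, support_b):
    """Pearson r between two tables of bipartition support.

    support_a, support_b : dicts mapping a bipartition label to its
    frequency (0..1) among the trees of each bootstrap set.
    """
    labels = list(set(support_a) | set(support_b))
    xs = [support_a.get(label, 0.0) for label in labels]
    ys = [support_b.get(label, 0.0) for label in labels]
    n = len(labels)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("support values are constant; r is undefined")
    return cov / sqrt(var_x * var_y)


# Hypothetical support tables for two genes.
gene1 = {"AB|CDE": 0.95, "ABC|DE": 0.80, "AC|BDE": 0.05}
gene2 = {"AB|CDE": 0.90, "ABC|DE": 0.70}
print(pearson_split_support(gene1, gene2))
```

A value near 1 means the two genes support the same groupings to a similar degree; a low or negative value points to conflicting signal.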

Thomas Marcussen

Hi Ana,

A quantitative way to compare the topological similarity between two trees is the Robinson-Foulds metric. It can be calculated in a number of software packages, but I have only used PhyloNet; this was a couple of years ago and there seem to be more options now.
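The Robinson-Foulds distance simply counts the bipartitions (splits) that are present in one tree but not in the other. A minimal Python sketch, assuming both trees carry the same taxa and are written as nested tuples of taxon names; the function names are illustrative and this is not PhyloNet's implementation:

```python
def leaf_set(tree):
    """All taxon names in a tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions, each stored as the side that lacks
    a fixed reference taxon so that rooting does not matter."""
    all_taxa = leaf_set(tree)
    anchor = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - below if anchor in below else below
        # Skip trivial splits (a single taxon versus everything else).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Toy example (hypothetical taxa): the trees disagree on where B and C sit.
gene_tree_1 = ((("A", "B"), "C"), ("D", "E"))
gene_tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(gene_tree_1, gene_tree_2))  # 2
```

For two fully resolved unrooted trees with n taxa the maximum is 2(n - 3), so dividing by that value gives a normalized distance between 0 and 1.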

However, from what you write, it seems that your idea is to concatenate the two alignments for analysis. That is not a good idea. Concatenation effectively requires that the two loci are linked (i.e. segregate together during meiosis), so that coalescent stochasticity = 0, which is rarely the case. Assuming that you're interested in obtaining a species phylogeny, you'd be much better off doing a coalescent analysis, in which the two gene phylogenies and the species phylogeny are reconstructed simultaneously. The *BEAST plugin to BEAST will do that for you, as will many other programs.
