With HIV-1 phylogenies we quite often have data sets for which important details of the true evolutionary history are known. For example, in a local transmission chain such as husband to wife to infant, we often know for certain who was infected first and the dates of the transmission events. In other cases the rough epidemiology is known, so we can tell that an epidemic spread out from a point-source introduction.

Very often, or perhaps always given certain relative levels of diversity, the phylogenetic trees produced from such a data set misroot one or more subclades. This essentially turns the subclade "inside out": its more diverse sequences end up attached nearest the outgroup. The explanation seems to be the long-branch attraction problem. In the father->mother->infant case, for example, the sequences from the infant often appear on a branch between the father and the mother, when we know for certain that this is "out of order".

What I want is a tool that lets me "fix" a known misrooting of a clade and then calculate the likelihood (or a similar measure) of the "correct" tree versus the misrooted tree. I can provide sample data sets and tree results to anyone who is interested.
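
A minimal sketch of that kind of comparison, under stated assumptions: the Python below scores two fixed topologies for the same alignment with Felsenstein's pruning algorithm under the Jukes-Cantor (JC69) model. The four sequences, the 0.1 branch lengths, and the "alternative" placement of the infant are all made up for illustration and are not real HIV-1 data. A real analysis would use a richer substitution model, re-optimize branch lengths for each topology, and apply a formal topology test. Note that under a time-reversible model such as JC69 the likelihood does not depend on where the root is drawn, so what changes when a subclade is re-attached at a different point is the unrooted topology of the whole tree.

```python
import math

BASES = "ACGT"

def jc69_prob(same, t):
    """JC69 probability of ending in the same / a different base after branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partials(node, site, aln):
    """Conditional likelihoods of A, C, G, T at `node` for one alignment column."""
    if isinstance(node, str):  # leaf: the observed base has probability 1
        return [1.0 if b == aln[node][site] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, t in node:  # internal node: tuple of (child, branch_length) pairs
        cp = partials(child, site, aln)
        for i in range(4):
            result[i] *= sum(jc69_prob(i == j, t) * cp[j] for j in range(4))
    return result

def log_likelihood(tree, aln):
    """Sum of per-site log-likelihoods with equal base frequencies at the root."""
    n_sites = len(next(iter(aln.values())))
    return sum(
        math.log(sum(0.25 * p for p in partials(tree, s, aln)))
        for s in range(n_sites)
    )

# Toy alignment (hypothetical, one sequence per person plus an outgroup).
aln = {
    "outgroup": "ACGTACGTAC",
    "father":   "ACGTACGTAA",
    "mother":   "ACGTTTGTAA",
    "infant":   "ACGTTTGAAA",
}

T = 0.1  # assumed branch length, identical on every branch and NOT optimized

# Topology matching the known history: mother and infant group together.
expected = (("outgroup", T), ("father", T), ((("mother", T), ("infant", T)), T))
# Hypothetical alternative: father and mother group together instead.
alternative = (("outgroup", T), ("infant", T), ((("father", T), ("mother", T)), T))

ll_expected = log_likelihood(expected, aln)
ll_alt = log_likelihood(alternative, aln)
print(f"lnL expected topology:    {ll_expected:.4f}")
print(f"lnL alternative topology: {ll_alt:.4f}")
print(f"difference:               {ll_expected - ll_alt:.4f}")
```

The raw difference in log-likelihood is only a starting point. Whether it is significant would still need a topology test on the real data, with branch lengths estimated separately for each tree.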

 
