"Bottom line is if you already have trees with greater resolution you do not need to combine data sets."
Greater resolution than the combined tree? If this was the case, if you had multiple trees from multiple genes, we would simply select the best resolved gene tree. A well-resolved gene tree does not imply that the signal generating the tree is phylogenetic. If your genes trees are incongruent, even more reason to combine them, since it is only through multi-loci datasets that gene-specific bias can be minimized, allowing true phylogenetic signal guide the tree.
An alternative bottom line might be: if your single-loci signal is strong, assume bias. Bias can then be mitigated through increasing loci in the dataset, meaningful testing of those data for orthology, and realistic modelling of the resulting refined dataset. Node support may well decrease (compared to some gene trees), but this more realistically defines the confidence of that node, given the data.
Actually we are trying to infer the phylogenetic relationship of 10 different samples and when I tried constructing the tree using the two loci they produced different results. I am new to phylogeny and did not know what to do so was wondering if it was possible to combine the data into a single tree or not. I am trying to learn!
It is absolutely possible to combine data sets. However, you first need to decide whether you really want to combine them. For that you should consider two things: 1. Do your separate data sets produce congruent trees? 2. Do they resolve with high support values?
You will mainly need to combine data sets if your separate data sets do not resolve trees with high support values. When the trees are more or less congruent, combining the data sets will improve resolution and raise support values. But if they are not congruent and differ significantly, you will most probably lose resolution and end up with a totally different tree. Hence, you need to decide whether to combine the data sets. Bottom line is if you already have trees with greater resolution you do not need to combine data sets. (However, you can still combine them, and they will produce better trees for interpretation.)
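If it helps, here is a rough way to check how congruent two gene trees are before deciding. This is only a sketch in plain Python, not a method from any particular package: the function names `clades` and `compare_trees` and the sample names are made up for illustration. It assumes both trees are rooted the same way (for example on the same outgroup), that tip names contain no parentheses, commas or quotes, and it ignores branch lengths and support values.

```python
def clades(newick):
    """Return the clades of a rooted Newick tree as frozensets of tip names."""
    found = set()
    stack = [[]]          # one list of tip names per open parenthesis
    token = ""
    skip_label = False    # True while reading a support value/label after ")"
    for ch in newick.strip().rstrip(";"):
        if ch in "(),":
            name = token.split(":")[0].strip()   # drop any branch length
            if name and not skip_label:
                stack[-1].append(name)
            token = ""
            skip_label = False
            if ch == "(":
                stack.append([])
            elif ch == ")":
                tips = stack.pop()
                found.add(frozenset(tips))
                stack[-1].extend(tips)
                skip_label = True
        else:
            token += ch
    return found


def compare_trees(newick_a, newick_b):
    """Return (shared clades, clades only in A, clades only in B)."""
    a, b = clades(newick_a), clades(newick_b)
    return a & b, a - b, b - a


# Hypothetical trees from two loci for the same five samples.
tree_locus1 = "(((s1,s2),s3),(s4,s5));"
tree_locus2 = "(((s1,s3),s2),(s4,s5));"

shared, only_1, only_2 = compare_trees(tree_locus1, tree_locus2)
print("shared:", sorted(sorted(c) for c in shared))
print("only in locus 1:", sorted(sorted(c) for c in only_1))
print("only in locus 2:", sorted(sorted(c) for c in only_2))
```

Clades that appear in only one tree are the points of conflict. Whether a conflict matters depends on how well supported those nodes are in each gene tree, which this sketch does not look at.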
You can combine data sets simply by pasting each species' (individual's) sequence from one locus onto the end of its sequence from the other locus.
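The same copy-and-paste step can be scripted. Below is a minimal sketch in plain Python, assuming each locus has already been aligned on its own and is held as a dictionary of sample name to aligned sequence. The function name `concatenate_alignments`, the sample names and the sequences are all invented for the example. Samples missing from a locus are padded with `?`, which is my assumption about how you would want to treat missing data.

```python
def concatenate_alignments(alignments, missing="?"):
    """Join per-locus alignments end to end, one row per sample.

    alignments: list of dicts mapping sample name -> aligned sequence.
    Returns (combined, partitions); partitions holds the 1-based,
    inclusive (start, end) column range of each locus.
    """
    samples = sorted(set().union(*(aln.keys() for aln in alignments)))
    combined = {name: "" for name in samples}
    partitions = []
    start = 1
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal aligned length")
        length = lengths.pop()
        for name in samples:
            # Pad samples that lack this locus with the missing-data symbol.
            combined[name] += aln.get(name, missing * length)
        partitions.append((start, start + length - 1))
        start += length
    return combined, partitions


# Made-up alignments for two loci; sample3 has no sequence for locus B.
locus_a = {"sample1": "ACGT-A", "sample2": "ACGTTA", "sample3": "ACGATA"}
locus_b = {"sample1": "GGCC", "sample2": "GGCT"}

combined, partitions = concatenate_alignments([locus_a, locus_b])
for name, seq in combined.items():
    print(name, seq)
print(partitions)
```

Keeping the column range of each locus is worthwhile because most tree-building programs let you assign a separate substitution model to each partition.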
Hope this helps. Let me know if you need more explanation.
"Bottom line is if you already have trees with greater resolution you do not need to combine data sets."
Greater resolution than the combined tree? If that were the case, then given multiple trees from multiple genes we would simply select the best-resolved gene tree. But a well-resolved gene tree does not imply that the signal generating it is phylogenetic. If your gene trees are incongruent, that is even more reason to combine them, since it is only through multilocus datasets that gene-specific bias can be minimized, allowing true phylogenetic signal to guide the tree.
An alternative bottom line might be: if your single-locus signal is strong, assume bias. Bias can then be mitigated by increasing the number of loci in the dataset, meaningfully testing those data for orthology, and realistically modelling the resulting refined dataset. Node support may well decrease (compared to some gene trees), but the lower value more realistically reflects the confidence in that node, given the data.
Alastair, I appreciate your point of view on this question.
What I understand from my experience is that there is no precise phylogenetic tree. What you can produce is a tree arbitrarily close to what you believe is the most probable tree. Having a huge data set also doesn't necessarily mean that you will get the most probable tree. All you need to do is run several tests and model simulations to obtain plausible and congruent trees.
In the case of multiple loci, it is important to have congruent trees, and you do not simply select the best trees and go with them, throwing away the rest. If any tree disagrees with the majority of the trees, that calls into question the suitability of that gene region for your purpose. If the combined data set yields a tree with higher support values and resolves the ambiguous nodes, you can always choose it.
Yes, indeed, combining data sets is a good practice in phylogenetic analysis. However, you should always keep in mind that you are trying to support your hypothesis with the strongest evidence. Which approach is better is largely a matter of differing opinion among systematic biologists.
Yes, you can produce a tree from multiple loci. I suggest running the analyses with each locus individually then combining them to see differences. Different genes have different evolutionary rates and thus can help answer different questions (population level, species level, genus level, etc.).
Here it is, Sequence Matrix (Vaidya et al., 2010):
Sequence Matrix facilitates the assembly of phylogenetic data matrices with multiple genes. Files for individual genes are dragged and dropped into a window and the sequences are concatenated. A table provides an overview of how much sequence information is available for the different genes and species. The user can ask Sequence Matrix to generate a wide variety of character and taxon sets (e.g. a taxon set with all species that have more than a specified number of genes or base pairs). The concatenated sequences can be exported in NEXUS or TNT format. Individual sequences can be excluded from being exported.
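To give an idea of what such a coverage table and taxon set amount to, here is a small plain-Python sketch. It is not how Sequence Matrix itself works internally; the function names `coverage` and `taxon_set`, the thresholds and the data are assumptions for illustration. Only unambiguous A, C, G and T are counted as bases, and the thresholds are applied as "at least" rather than "more than".

```python
def coverage(loci):
    """Map each sample to (number of loci with data, total A/C/G/T bases)."""
    table = {}
    for aln in loci.values():
        for name, seq in aln.items():
            bases = sum(ch.upper() in "ACGT" for ch in seq)
            if bases == 0:
                continue  # only gaps or missing data: treat the locus as absent
            genes, total = table.get(name, (0, 0))
            table[name] = (genes + 1, total + bases)
    return table


def taxon_set(loci, min_genes=1, min_bases=0):
    """Names of samples with at least min_genes loci and min_bases bases."""
    return sorted(
        name
        for name, (genes, bases) in coverage(loci).items()
        if genes >= min_genes and bases >= min_bases
    )


# Made-up data: sample3 was sequenced for locus A only.
loci = {
    "locusA": {"sample1": "ACGT-A", "sample2": "ACGTTA", "sample3": "ACGATA"},
    "locusB": {"sample1": "GGCC", "sample2": "GGCT"},
}

for name, (genes, bases) in sorted(coverage(loci).items()):
    print(name, genes, bases)
print(taxon_set(loci, min_genes=2))
```

A filter like this is handy when a few samples with very little sequence are destabilizing the combined tree and you want to try the analysis without them.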
(Download link for recent version of the above mentioned software: https://sequencematrix.googlecode.com/files/SequenceMatrix-Windows-1.7.8.zip)
Reference:
Vaidya, G., Lohman, D. J., Meier, R. (2010) SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics, early view.
(Download link for Reference: http://onlinelibrary.wiley.com/doi/10.1111/j.1096-0031.2010.00329.x/abstract;jsessionid=5828B885FAB6CE39ECBA4027CA4B22EA.f04t01)