I am using MEGA to build a phylogenetic tree from two genes. They have already been aligned separately using MUSCLE, but now I don't know how to continue...
Hi Felipe. How are you? I would also like to know more about these analyses, so I will follow the discussion. I was wondering whether, in this case, the way to go would be a "concatenated analysis" of the two or more genes. Best regards.
(Originally posted in Portuguese; apologies to those who do not read it.)
This is a recurring question nowadays, and I would like to add such a feature to software that we are developing: http://inf.imo-chile.cl/software/bosque.html
To do this, it must be possible to map the source organism of a gene in one alignment to the corresponding sequence in the other FASTA alignment. The FASTA format is too limited to provide a reliable way to do this, since it consists only of names and sequences. So the mapping has to be encoded somehow in the sequence names. Relying on sequence order is not safe, because some aligners may change the order of the sequences in the FASTA file (e.g. the --reorder option in mafft).
Having said this, I don't think MEGA has an option to do this, because it is not clear how the mapping described above should be done. You will have to program it yourself and construct a new alignment with the concatenated sequences properly matched. If the sequences come from the KEGG database, for example, this is quite easy, because KEGG provides names such as:
org:gene_name
Then the mapping is straightforward from the information in these fields. We have not yet added this feature to Bosque, but will do so soon. For the moment, if you have a particular way to do this mapping and can send an example of your sequence names, I can help you with a small script/program to construct a concatenated alignment. Just send me a message if you want.
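As a rough illustration of the name-based mapping described above, here is a minimal Python sketch. It assumes headers of the form org:gene_name, where the part before the first colon identifies the organism; the function names (parse_fasta, by_organism, concatenate_alignments) are made up for this example and are not part of MEGA or Bosque. Organisms missing from one alignment are padded with gaps, which is one possible choice, not the only one.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of header -> sequence (file order kept)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def by_organism(fasta_text):
    """Key aligned sequences by organism and check the alignment is rectangular."""
    result = {}
    for header, seq in parse_fasta(fasta_text).items():
        # Assumption: header looks like "org:gene_name"
        org = header.split(":", 1)[0]
        if org in result:
            raise ValueError(f"organism {org!r} appears more than once")
        result[org] = seq
    lengths = {len(s) for s in result.values()}
    if len(lengths) > 1:
        raise ValueError("sequences in one alignment differ in length")
    return result, (lengths.pop() if lengths else 0)


def concatenate_alignments(fasta_a, fasta_b, gap="-"):
    """Join two alignments organism by organism, regardless of record order."""
    a, len_a = by_organism(fasta_a)
    b, len_b = by_organism(fasta_b)
    organisms = list(a) + [o for o in b if o not in a]
    # An organism absent from one alignment gets a gap-only block there
    return {o: a.get(o, gap * len_a) + b.get(o, gap * len_b) for o in organisms}


if __name__ == "__main__":
    gene1 = ">eco:b0001\nATG-C\n>sty:STM1\nATGGC\n"
    gene2 = ">sty:STM9\nTT-A\n>eco:b0009\nTTGA\n"  # note the different order
    for org, seq in concatenate_alignments(gene1, gene2).items():
        print(f">{org}\n{seq}")
```

Because the matching is done on the organism prefix, the result does not depend on the order in which the aligner wrote the sequences.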
Hi all. First you should concatenate the FASTA files containing the different loci. I recommend Geneious, which I consider the best program for this: just select your files and click on Tools>Concatenate Sequences or Alignments, and you will have your concatenated matrix ready for phylogenetic analysis in MEGA or any other software. The alternatives for concatenating sequences are complicated and will take you too much time. Geneious has no free version (only a 30-day trial). Hope this helps.
I didn't know Geneious could do this, given the reasons I stated above. Is the concatenation based on the order of the sequences in the FASTA alignment? If so, make sure that the input sequences in the original unaligned FASTA files are in the same order, and do not use options such as --reorder in mafft when doing the alignment.
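If a tool does concatenate by position, a quick check that the aligner kept the input order can be done along these lines. This is only a sketch; fasta_names and same_order are hypothetical helper names, and the check simply compares the header lines of the two files.

```python
def fasta_names(text):
    """Return the sequence names of a FASTA text, in file order."""
    return [line[1:].strip() for line in text.splitlines() if line.startswith(">")]


def same_order(unaligned_fasta, aligned_fasta):
    """True if both FASTA texts list the same names in the same order."""
    return fasta_names(unaligned_fasta) == fasta_names(aligned_fasta)


if __name__ == "__main__":
    unaligned = ">a\nAT\n>b\nGC\n"
    reordered = ">b\nG-C\n>a\nA-T\n"
    print(same_order(unaligned, reordered))
```

The same comparison can be run between the two aligned files (using organism names rather than full headers) before trusting an order-based concatenation.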
Saying that something else "is the worst thing" is kind of the worst thing one can do. There is nothing wrong with making assumptions when you are allowed to make them. Of course, if you believe or suspect that one or more of the genes was acquired recently by lateral gene transfer, for example, then I agree that concatenation can be a bad idea.
Think, for example, of the case where those two genes are associated with ribosomal proteins (a case of "tightly linked loci"). Those genes can naturally be assumed to have an identical evolutionary history. This is commonly assumed and used in phylogenomics nowadays. So don't be afraid of concatenation in cases like these. Other software, as suggested, relies on a different set of assumptions, so there is no perfect solution to this problem. In any case, none of these experiments can be the worst thing you can do, for sure! So test them all and draw your own conclusions.
You should first consider whether the genes can be combined at all. If so, you can use Sequence Matrix to concatenate them, and then use Geneious to export a file in MEGA format. The file is then ready for phylogenetic analysis.