Any suggestions on which software to use? I would also like to know whether I should align the gene sequences (in FASTA format) first and then concatenate them, or first concatenate all the genes and then align them across the different species, before using them for phylogeny.
1) You can use an online FASTA alignment joiner tool to concatenate gene sequences: http://users-birc.au.dk/biopv/php/fabox/alignment_joiner.php. The gene sequences have to be in FASTA format, with the taxa in the same order in every file.
2) Align each data set (gene) separately before concatenating.
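If you would rather script the two steps above than use the web tool, here is a minimal Python sketch of the idea. It is not how the joiner tool itself works: the function names `read_fasta` and `concatenate`, the toy sequences, and the choice to match taxa by name and pad missing genes with `?` are all my own assumptions for illustration.

```python
from collections import OrderedDict


def read_fasta(text):
    """Parse FASTA text into an ordered {name: sequence} mapping."""
    records = OrderedDict()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the taxon name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return OrderedDict((n, "".join(parts)) for n, parts in records.items())


def concatenate(alignments, missing="?"):
    """Join per-gene alignments by taxon name.

    Each alignment must already be aligned (all sequences the same length).
    A taxon absent from a gene is padded with `missing` (an assumption here).
    Returns the concatenated matrix and 1-based (start, end) gene partitions.
    """
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)
    matrix = {t: [] for t in taxa}
    partitions = []
    start = 1
    for aln in alignments:
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene differ in length; align first")
        length = lengths.pop()
        for t in taxa:
            matrix[t].append(aln.get(t, missing * length))
        partitions.append((start, start + length - 1))
        start += length
    return {t: "".join(parts) for t, parts in matrix.items()}, partitions


# Toy data (hypothetical): two genes, aligned separately beforehand.
gene1 = read_fasta(">sp_A\nATG-CA\n>sp_B\nATGTCA\n")
gene2 = read_fasta(">sp_A\nGGCC\n>sp_B\nGG-C\n>sp_C\nGGTC\n")
supermatrix, partitions = concatenate([gene1, gene2])
for taxon, seq in supermatrix.items():
    print(f">{taxon}\n{seq}")
```

The `partitions` list records where each gene starts and ends in the concatenated matrix, which is what you would need if you later define character sets per gene.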
SequenceMatrix (Vaidya et al., 2010) facilitates the assembly of phylogenetic data matrices with multiple genes. Files for individual genes are dragged and dropped into a window and the sequences are concatenated. A table provides an overview of how much sequence information is available for the different genes and species. The user can ask SequenceMatrix to generate a wide variety of character and taxon sets (e.g. a taxon set with all species that have more than a specified number of genes or base pairs). The concatenated sequences can be exported in NEXUS or TNT format. Individual sequences can be excluded from the export.
(Download link for recent version of the above mentioned software: https://sequencematrix.googlecode.com/files/SequenceMatrix-Windows-1.7.8.zip)
Reference:
Vaidya, G., Lohman, D. J., Meier, R. (2010) SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics, early view.
(Download link for reference: http://onlinelibrary.wiley.com/doi/10.1111/j.1096-0031.2010.00329.x/abstract;jsessionid=5828B885FAB6CE39ECBA4027CA4B22EA.f04t01)
Thanks for your reply. I too have been using MEGA 6, and I need help with how to concatenate using MEGA or any other software. It is clear to me that the number of taxa and the number of sequences should be the same for all the analysed markers. Another thing I need help with is whether there is any software we can use to compare two phylogenetic trees. With regards,
I have a related question: say you have multiple markers for several species, and for some marker*species combinations you have more than one sequence. What is the most correct way to prepare the concatenated matrix for a Bayesian analysis, in MrBayes for instance (I think the answer might be different for maximum parsimony or likelihood)? Would it be to have a single entry for that species with some polymorphic or ambiguous sites, or separate entries for each variant? (I know that alternatively one could use a BEST or *BEAST approach, which estimates the best species topology given several estimates of gene genealogies. But I am still interested in the question above.)
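To make the first option concrete (one entry per species with ambiguous sites), here is a rough Python sketch that collapses several aligned sequences of the same marker into a single sequence using the standard IUPAC nucleotide ambiguity codes. The function name `collapse_to_ambiguity` and the toy sequences are made up for illustration, and ignoring gaps and pre-existing ambiguity codes within a column is a simplifying assumption. This only shows what such an entry would look like; it says nothing about which option is the better one.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def collapse_to_ambiguity(seqs):
    """Collapse aligned sequences of one species into one ambiguity-coded entry.

    Assumption: only A/C/G/T are counted in each column; gaps and any existing
    ambiguity codes are ignored, and a column with no A/C/G/T becomes a gap.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*seqs):
        bases = frozenset(b.upper() for b in column if b.upper() in "ACGT")
        out.append(IUPAC[bases] if bases else "-")
    return "".join(out)


# Toy data (hypothetical): three sequences of one marker from one species.
merged = collapse_to_ambiguity(["ATGCA", "ATGTA", "A-GCA"])
print(merged)
```

In the toy example the fourth column contains both C and T, so it is written as Y in the merged entry; the alternative would be to keep the three sequences as three separate rows in the matrix.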