Any suggestions on which software to use? I would also like to know whether I should align each gene separately (in FASTA format) and then concatenate the alignments, or first concatenate all the genes and then align them across the different species, before using the result for phylogeny.
1) You can use an online FASTA alignment joiner tool to concatenate gene sequences: http://users-birc.au.dk/biopv/php/fabox/alignment_joiner.php. The gene sequences have to be in FASTA format, with the sequences in the same order in each file.
2) Align each data set separately before concatenating.
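To illustrate the "align first, then concatenate" order, here is a minimal Python sketch that joins per-gene alignments by taxon name rather than by position in the file. It is not what the online joiner does internally; the function names `read_fasta` and `concatenate`, the toy sequences, and the choice to pad a missing gene with gap characters are all assumptions made for the example.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dict (insertion-ordered)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the taxon name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def concatenate(alignments, missing="-"):
    """Join already-aligned genes by taxon name, padding absent taxa."""
    # Collect every taxon seen in any gene, in first-seen order.
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)

    combined = {name: [] for name in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene differ in length; align first")
        width = lengths.pop()
        for name in taxa:
            # A taxon lacking this gene gets a block of `missing` characters.
            combined[name].append(aln.get(name, missing * width))
    return {name: "".join(parts) for name, parts in combined.items()}


# Toy data (hypothetical): two aligned genes with partly different taxa.
gene1 = ">sp_A\nATG-CA\n>sp_B\nATGGCA\n"
gene2 = ">sp_B\nTTAA\n>sp_C\nTT-A\n"

matrix = concatenate([read_fasta(gene1), read_fasta(gene2)])
for name, seq in matrix.items():
    print(f">{name}\n{seq}")
```

Matching on names instead of file order means the taxa do not have to appear in the same order in every gene file, and a species missing one marker still ends up with a row of the correct length.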
Sequence Matrix (Vaidya et al., 2010) facilitates the assembly of phylogenetic data matrices with multiple genes. Files for individual genes are dragged and dropped into a window and the sequences are concatenated. A table provides an overview of how much sequence information is available for the different genes and species. The user can ask Sequence Matrix to generate a wide variety of character and taxon sets (e.g. a taxon set with all species that have more than a specified number of genes or base pairs). The concatenated sequences can be exported in NEXUS or TNT format. Individual sequences can be excluded from being exported.
(Download link for recent version of the above mentioned software: https://sequencematrix.googlecode.com/files/SequenceMatrix-Windows-1.7.8.zip)
Reference:
Vaidya, G., Lohman, D. J., Meier, R. (2010) SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics, early view.
(Download link for Reference: http://onlinelibrary.wiley.com/doi/10.1111/j.1096-0031.2010.00329.x/abstract;jsessionid=5828B885FAB6CE39ECBA4027CA4B22EA.f04t01)
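For readers who want to see what a character set per gene looks like in a concatenated NEXUS file, here is a rough Python sketch. It is not SequenceMatrix's exporter and its output will differ in detail; the function names `charset_ranges` and `to_nexus`, the gene names, and the toy matrix are assumptions for illustration only.

```python
def charset_ranges(gene_lengths):
    """Turn [(gene_name, aligned_length), ...] into 1-based inclusive ranges."""
    ranges = []
    start = 1
    for name, length in gene_lengths:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


def to_nexus(matrix, gene_lengths):
    """Write a concatenated matrix plus one charset per gene as NEXUS text."""
    ntax = len(matrix)
    nchar = sum(length for _, length in gene_lengths)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={ntax} nchar={nchar};",
        "  format datatype=dna missing=? gap=-;",
        "  matrix",
    ]
    for name, seq in matrix.items():
        if len(seq) != nchar:
            raise ValueError(f"{name} has {len(seq)} columns, expected {nchar}")
        lines.append(f"  {name}  {seq}")
    lines += ["  ;", "end;", "begin sets;"]
    for name, start, end in charset_ranges(gene_lengths):
        lines.append(f"  charset {name} = {start}-{end};")
    lines.append("end;")
    return "\n".join(lines) + "\n"


# Toy concatenated matrix (hypothetical): gene1 is 6 columns, gene2 is 4.
matrix = {
    "sp_A": "ATG-CA----",
    "sp_B": "ATGGCATTAA",
    "sp_C": "------TT-A",
}
gene_lengths = [("gene1", 6), ("gene2", 4)]

ranges = charset_ranges(gene_lengths)
nexus_text = to_nexus(matrix, gene_lengths)
print(nexus_text)
```

The charset ranges are what allow a downstream program to treat each gene as its own partition, so it is worth checking them against the per-gene alignment lengths before running an analysis.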
Thanks for your reply. I too have been using MEGA 6, and I need help with how to concatenate using MEGA or any other software. It is clear to me that the number of taxa and the number of sequences should be the same for all the analysed markers. I would also like to know whether there is any software we can use to compare two phylogenetic trees. With regards,
I have an associated question: say you have multiple markers for several species, and for some marker*species combinations you have more than one sequence. What is the most correct way to prepare the concatenated matrix for a Bayesian analysis, in MrBayes for instance (I think the answer might be different for maximum parsimony or likelihood)? Would it be to have a single entry for that species with some polymorphic or ambiguous sites, or separate entries for each variant? (I know that alternatively one could use a BEST or *BEAST approach that estimates the best species topology given several estimates of genealogies. But I am still interested in the question above.)
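To make the first option concrete (a single entry per species with ambiguous sites), here is a small Python sketch that collapses several aligned sequences of one marker into one sequence using the standard IUPAC nucleotide ambiguity codes. It only illustrates how such an entry could be built and is not a recommendation of that option over the other; the function name `consensus`, the toy sequences, and the decision to ignore gaps and `?` when at least one sequence has a base are assumptions.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
TO_BASES = {code: frozenset(bases) for code, bases in CODES.items()}
TO_CODE = {bases: code for code, bases in TO_BASES.items()}


def consensus(seqs):
    """Collapse aligned sequences into one, coding polymorphic sites as IUPAC."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*seqs):
        bases = set()
        for ch in column:
            ch = ch.upper()
            if ch in "-?":
                continue  # assumption: gaps/missing carry no information
            # Unknown symbols are treated as fully ambiguous (N).
            bases |= TO_BASES.get(ch, TO_BASES["N"])
        out.append(TO_CODE[frozenset(bases)] if bases else "-")
    return "".join(out)


# Toy data (hypothetical): three sequences of one marker from one species.
merged = consensus(["ATGCA", "ATGTA", "AT-TA"])
print(merged)
```

In this toy case the fourth column holds both C and T, so it is written as Y, while the gap in the third sequence is ignored because the other two sequences agree on G.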
I have discovered a new tool named PopART for concatenated sequences. Maybe you can use it for your studies. PopART (Population Analysis with Reticulate Trees) is free, open-source population genetics software that was developed as part of the Allan Wilson Centre Imaging Evolution Initiative. This is a collaborative project involving mathematicians and biologists from five universities and research institutes across New Zealand to develop better software to understand evolutionary relationships among populations.
This question is one of the most challenging, especially for a new researcher stepping into bioinformatics. I have used most of the software mentioned here but am not making progress. FASconCAT is a Perl script that should work on Linux. I have 14,000 orthologous genes that I want to concatenate, but none of these programs or scripts has given me results. FASconCAT usually stalls at around 70% and never finishes... Any help or suggestions?
You may also try PhyloSuite (https://dongzhang0725.github.io/), which is free software for evolutionary phylogenetics studies. A brief example of how to concatenate genes is here: https://dongzhang0725.github.io/dongzhang0725.github.io/documentation/#5-7-1-Brief-example .