I have aligned the complete mitochondrial genomes of well over 200 vertebrates, mostly mammals, with some birds, reptiles, amphibians, and fish as outgroups. Mitochondrial DNA evolves about ten-fold faster than nuclear DNA, and the mitochondrial genomes of the most distant mammals (say rodent compared to primate, or even lemur compared to chimpanzee) are beyond saturation with mutations.

I also have a multiple sequence alignment of the elongation factor 2 gene mRNA/cDNA sequences from hundreds of species. Again, the alignment is fairly easy to produce, because this protein is highly conserved in all life forms. A BLAST search, for example, can pull up hundreds of these sequences, and from there it is simple to download the cDNA/mRNA coding regions and align them.

I now want to pull the complete elongation factor 2 gene, with introns, from each of the several hundred vertebrate genomes that have been sequenced. Not all of these genomes have been fully assembled and annotated, but most vertebrate genome projects have at least large contigs completed, and I can find the EF-2 gene annotated in many of them. However, I am having difficulty finding a tool that will use either BLAST similarity or annotation-based searching to scan all of the genomes for the EF-2 gene and download the complete gene (perhaps including 500 bases of upstream and downstream flanking sequence) with a file name or sequence name that gives the species and accession number.
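To make the request concrete, here is a minimal Python sketch of the "gene plus 500 bases of flank, with an informative name" step, assuming the gene coordinates on a contig are already known from the annotation. The function names, the `EEF2` label, and the `NC_000000.0` accession are hypothetical placeholders. The URL uses the NCBI E-utilities `efetch` endpoint with its `seq_start`, `seq_stop`, and `strand` parameters. Nothing is downloaded here; the sketch only builds the request and the sequence name.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (fetches a sequence record or a sub-range of it).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def flanked_region(start, end, contig_len, flank=500):
    """Pad a 1-based inclusive gene interval by `flank` bases, clamped to the contig."""
    lo, hi = min(start, end), max(start, end)
    return max(1, lo - flank), min(contig_len, hi + flank)


def fasta_name(species, accession, start, end, gene="EEF2"):
    """Build a sequence name carrying species, accession, gene label and coordinates.

    The layout of the name is an illustrative choice, not a standard.
    """
    return f"{species.replace(' ', '_')}|{accession}|{gene}|{start}-{end}"


def efetch_url(accession, start, end, minus_strand=False):
    """Build an efetch URL for a FASTA sub-range of a nucleotide record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": end,
        "strand": 2 if minus_strand else 1,  # 1 = plus strand, 2 = minus strand
    }
    return f"{EFETCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical gene at 1200..9800 on a 10,000-base contig.
    lo, hi = flanked_region(1200, 9800, contig_len=10000)
    print(fasta_name("Homo sapiens", "NC_000000.0", lo, hi))
    print(efetch_url("NC_000000.0", lo, hi))
```

Looping this over a table of (species, accession, start, end, strand) rows would give one consistently named FASTA record per genome. Building that table across hundreds of genomes is the part I am missing.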

With some tools, I can search one genome at a time, look up the EF-2 gene in the annotation, ask to download or save that region of the genome, and then use another tool to give the sequence a good name. With BLAST, I get lots of "hits" that sometimes contain a complete gene but more often shatter each gene into many individual hits. I can download them all, but they are a jumbled mess with gi-numbers for identifiers and so on.
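The shattered-hits problem can be illustrated with a short Python sketch that collapses the individual hits on each subject sequence back into one span per gene. It assumes the hits were saved in BLAST+ tabular format (`-outfmt 6`, with the default twelve columns, where columns 9 and 10 are subject start and end). The `merge_hits` name and the `max_gap` value of 20,000 bases, a guess at the largest intron worth bridging, are assumptions for illustration only.

```python
from collections import defaultdict


def merge_hits(tabular_text, max_gap=20000):
    """Collapse fragmented BLAST hits into gene-sized spans per subject and strand.

    Returns a list of (subject_id, strand, start, end) tuples, 1-based inclusive.
    Hits on the same subject and strand are merged when the gap between them
    is at most `max_gap` bases (an assumed upper bound on intron length).
    """
    by_subject = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        sseqid, sstart, send = cols[1], int(cols[8]), int(cols[9])
        # In tabular output a minus-strand hit has sstart greater than send.
        strand = "+" if sstart <= send else "-"
        by_subject[(sseqid, strand)].append((min(sstart, send), max(sstart, send)))

    spans = []
    for (sseqid, strand), hits in sorted(by_subject.items()):
        hits.sort()
        cur_start, cur_end = hits[0]
        for start, end in hits[1:]:
            if start - cur_end <= max_gap:
                cur_end = max(cur_end, end)  # same gene: extend the current span
            else:
                spans.append((sseqid, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        spans.append((sseqid, strand, cur_start, cur_end))
    return spans
```

Each merged span could then be padded with flanking sequence and fetched as a single region, instead of downloading every hit separately. Paralogs or tandem duplicates within `max_gap` of each other would be merged incorrectly, so the threshold needs care.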

I know there has to be a better way to get the data I want here. I know there are databases of gene homologs and so on, and perhaps I can find what I want in one of them. Can anyone help me?
