Downloading WGS contigs with Biopython and Entrez is easy when the sequences use the older header format, such as:

> gi|351789644|gb|AEHK01402178.1|

This is done with:

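from Bio import SeqIO
from Bio import Entrez

# NCBI asks for a contact address with every Entrez request
Entrez.email = "[email protected]"

# cntg holds the contig identifier taken from the header above
handle = Entrez.efetch(db="nucleotide", id=cntg, rettype="fasta", retmode="text")
record = SeqIO.read(handle, "fasta")
handle.close()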

However, as of this year (2016), the new headers carry only the GenBank accession number:

> gb|JSUE03030586.1|
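
For reference, the identifier fields in the two header styles can be pulled apart along these lines. This is only a minimal sketch: parse_fasta_header and entrez_id are hypothetical helper names of my own, not Biopython functions, and it assumes the header fields always alternate tag, value, separated by '|'.

```python
def parse_fasta_header(header):
    """Split an NCBI-style FASTA header into {tag: value} pairs (hypothetical helper)."""
    # Drop the leading '>' and surrounding whitespace, then split on '|',
    # discarding the empty field left by the trailing '|'.
    fields = [f for f in header.lstrip(">").strip().split("|") if f]
    # Fields alternate tag, value, e.g. ['gi', '351789644', 'gb', 'AEHK01402178.1'].
    return dict(zip(fields[0::2], fields[1::2]))


def entrez_id(header):
    """Return the GI number when present, otherwise the GenBank accession."""
    ids = parse_fasta_header(header)
    return ids.get("gi") or ids.get("gb")


for h in ("> gi|351789644|gb|AEHK01402178.1|", "> gb|JSUE03030586.1|"):
    print(entrez_id(h))
```

The string returned by entrez_id is what would take the place of cntg in the efetch call; for the new-style header that is the bare accession with its version suffix.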

Querying the 'nucleotide' database as above with these new identifiers does not produce a valid handle (i.e., the Entrez query does not return a record). I have tried other databases (e.g., "genome", "gene", "assembly"); however, none of them return a valid result.

Does anybody know of a solution?
