I work on E. coli genomes. While going through the annotated genes, I noticed (link) that the coordinates field of one gene's description joins several separate regions of the genome:

join(1463416..1465928,1467265..1467317,1468541..1472037)
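To make clear how I am reading this location string, here is a minimal sketch in Python. It assumes the coordinates are 1-based and inclusive, and it handles only a plain forward-strand join(...) with no complement() wrapper or partial markers such as < and >. The function names parse_join and total_length are my own, not part of any library.

```python
import re


def parse_join(location):
    """Split a simple 'join(a..b,c..d,...)' string into (start, end) tuples.

    Assumption: forward strand only, no complement() and no '<' or '>' markers.
    """
    match = re.fullmatch(r"join\((.*)\)", location.strip())
    if match is None:
        raise ValueError(f"not a simple join() location: {location!r}")
    segments = []
    for part in match.group(1).split(","):
        start, end = part.strip().split("..")
        segments.append((int(start), int(end)))
    return segments


def total_length(segments):
    """Sum segment lengths, treating both ends as inclusive (1-based)."""
    return sum(end - start + 1 for start, end in segments)


segments = parse_join("join(1463416..1465928,1467265..1467317,1468541..1472037)")
for start, end in segments:
    print(start, end, end - start + 1)
print("total bases:", total_length(segments))
```

Read this way, the feature consists of three separate intervals, with stretches of the genome between them that are not part of the feature.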

Why is this so? Isn't a gene sequence meant to be continuous?

The other thing I am not clear about is the difference between a gene and a CDS. In the E. coli annotation file from the NCBI FTP site, I see cases where multiple CDS features are listed for a single gene, and sometimes they also overlap. In most cases, though, the gene and its corresponding CDS have the same coordinates.

Could somebody please clarify these two points?

http://www.genome.jp/dbget-bin/www_bget?eco:b4492
