I am trying to analyze the relationship between GC content and CDS length. For Mycobacterium tuberculosis Haarlem3 NITR202 uid202216, this relationship is quite different from that of other substrains of Mycobacterium tuberculosis. How can one explain this?
It's simple. Once you know which strain you are dealing with, look for the changes (variations) in its genome. If the variation is less than 10%, treat it as a different sub-strain. If 20-30% of the genome varies, it is a different genotype.
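
As a rough illustration, the rule of thumb above could be sketched in Python like this. It assumes the two sequences have already been aligned to the same length and that "variation" means the percentage of mismatching positions. The function names, the toy sequences, and the handling of values outside the two stated ranges are only assumptions for the example, not an established method.

```python
def percent_variation(ref: str, query: str) -> float:
    """Percentage of aligned positions at which two sequences differ.

    Assumes both sequences are already aligned to the same length.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if not ref:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(ref.upper(), query.upper()) if a != b)
    return 100.0 * mismatches / len(ref)


def classify(pct: float) -> str:
    """Apply the thresholds from the answer above (hypothetical helper)."""
    if pct == 0:
        # Assumption: identical sequences are reported separately.
        return "no variation detected"
    if pct < 10:
        return "different sub-strain"
    if 20 <= pct <= 30:
        return "different genotype"
    # The answer does not say what 10-20% or more than 30% means.
    return "not covered by the rule of thumb"


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"  # toy sequence, 20 bases
    sample = "ACGTACGTACGTACGTACGA"     # differs at the last base
    pct = percent_variation(reference, sample)
    print(f"{pct:.1f}% variation: {classify(pct)}")
```

For whole genomes you would run this on the output of a proper alignment rather than on raw sequences, since insertions and deletions shift positions.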