Bacteria usually have multiple copies of the 16S rRNA genes and of some other functional genes, but most genes seem to exist as a single copy in the chromosome. What is the reason for these differences?
This is the abstract of Klappenbach et al. (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC91988/):
"Although natural selection appears to favor the elimination of gene redundancy in prokaryotes, multiple copies of each rRNA-encoding gene are common on bacterial chromosomes. Despite this conspicuous deviation from single-copy genes, no phenotype has been consistently associated with rRNA gene copy number. We found that the number of rRNA genes correlates with the rate at which phylogenetically diverse bacteria respond to resource availability. Soil bacteria that formed colonies rapidly upon exposure to a nutritionally complex medium contained an average of 5.5 copies of the small subunit rRNA gene, whereas bacteria that responded slowly contained an average of 1.4 copies. In soil microcosms pulsed with the herbicide 2,4-dichlorophenoxyacetic acid (2,4-D), indigenous populations of 2,4-D-degrading bacteria with multiple rRNA genes (x̄ = 5.4) became dominant, whereas populations with fewer rRNA genes (x̄ = 2.7) were favored in unamended controls. These findings demonstrate phenotypic effects associated with rRNA gene copy number that are indicative of ecological strategies influencing the structure of natural microbial communities."
The reason for this duplication seems to be purely evolutionary. A bacterial genome always has to strike a compromise between its length and its efficiency.
A large genome carries a lot of information but is energetically expensive to maintain. On the other hand, some cell processes need to be performed at high speed, so a greater number of copies is necessary for fundamental regulatory genes. This gives the polymerases easier access to them. Because these processes run faster, the growth rate of the cells increases, and with it the overall efficiency of the species.
The other genes, present in single copy, don't need to be transcribed as fast because their products are not needed at high concentrations. That's why only the most important genes are organized in multiple copies. Keeping the remaining genes in single copy maintains a small genome, and thus a lower use of energy resources.
Usually, so-called "housekeeping genes" are needed at high concentrations because they are essential and required throughout the life span, e.g. rRNA genes and genes of metabolic pathways. Some genes and their products, however, are needed only occasionally and are expressed only when activated by a ligand, so their copy number is limited.
I agree with both contributors, Cynthia and Aadhinath. It is a simple matter of demand and supply. 16S rRNA, as you mentioned, plays a key role in the initiation of translation because it carries sequences complementary to the Shine-Dalgarno sequences of mRNA. It is important to highlight that translation is one of the most robust processes in the cell. More and more 16S rRNA is therefore required to meet the demand, which multiple copies of its gene can supply. The same holds for other genes whose protein or RNA products are required in large amounts.
Plus, as Ricardo quoted,
it is interesting that gene families are also thought to arise from multiple copies of the same gene accumulating mutations in different directions. Without being fatal to the organism, the genes mutate in such a way that the original product keeps being made alongside the altered one. This increases diversity, provides an ample supply and saves space, because the genetic material is limited in its number of base pairs.
Studies in E. coli addressing the copy number of the rRNA genes have shown that the maximal growth rate is reduced as the copy number of the genes is reduced.
Hi Artur, thanks for your kind corrections to my response. The rDNA you mention was covered by my statement. Second, the fact that somebody asks about bacterial systems does not mean the answer has to be limited to bacteria. I tried to answer his question directly first, which is why I covered only rRNA. Once that was done, I moved on to other reasons behind multiple copies, which does not mean I was still confined to bacterial systems; I wanted him to know what happens in other systems. One more thing: only anabolic pathways in bacteria are under a repression system, while catabolic pathways and their genes are under an activation system. I hope this helps you view my answer from another angle.
For the effects of varying rRNA copy numbers in Bacillus subtilis you may wish to look at: Koichi Yano et al, Microbiology (2013), 159, 2225–2236, Multiple rRNA operons are essential for efficient cell growth and sporulation as well as outgrowth in Bacillus subtilis.
The effects of going down in rRNA operon number are quite convincing. For the moment, I do not remember any studies in which the copy number of single-copy genes in bacteria was increased artificially or by selection, other than those using plasmids.
Does the copy number really matter in terms of higher expression? Or are promoters and regulatory genes perhaps more important for the level of expression?
@Daniel, yes, most would suggest that copy number is important, at least for rRNA genes: transcription of those genes is already at a very high level in rapidly growing cells, and the promoters are close to their optimum.
If you're interested in mechanisms of gene amplification in bacteria, I suggest you read some of the papers by J. Roth and coworkers on gene amplification under selection, e.g. "Multiple pathways of selected gene amplification during adaptive mutation".
Kugelberg E, Kofoid E, Reams AB, Andersson DI, Roth JR.
Proc Natl Acad Sci U S A. 2006 Nov 14;103(46):17319-24. Epub 2006 Nov 2.
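To make the reply to Daniel concrete, here is a toy sketch of why copy number matters once a promoter is already close to its optimum. It is only an illustration under stated assumptions: the saturating rate law, the parameter values and the function names are all made up for this example and are not taken from any of the cited studies.

```python
# Toy model (assumption): the transcription rate of one gene copy saturates
# with promoter strength, and total output is the sum over identical copies.

def per_copy_rate(promoter_strength, max_rate=1.0, half_saturation=1.0):
    """Hypothetical saturating rate for a single gene copy."""
    return max_rate * promoter_strength / (half_saturation + promoter_strength)


def total_output(copies, promoter_strength):
    """Total transcript output from `copies` identical gene copies."""
    return copies * per_copy_rate(promoter_strength)


# A promoter that already runs at 90% of its maximum (arbitrary units).
baseline = total_output(copies=1, promoter_strength=9.0)

# Doubling promoter strength barely helps, because the rate is saturated.
stronger_promoter = total_output(copies=1, promoter_strength=18.0)

# Doubling the copy number doubles the output in this toy model.
more_copies = total_output(copies=2, promoter_strength=9.0)

print(baseline, stronger_promoter, more_copies)
```

In this sketch a stronger promoter gives only a marginal gain, whereas a second copy doubles the output, which is the intuition behind the argument for rRNA genes. Real cells are of course more complicated than this.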
Beyond rRNA, there is a report from an in vitro evolution experiment by Blount et al. (doi:10.1038/nature11514) showing how gene amplification improves a weak phenotype.
In a long-term culture, some of their Escherichia coli acquired the ability to use citrate from the culture medium as a carbon source under aerobic conditions, which the starting culture could not do.
The new phenotype occurred when a previously silent citrate importer captured an aerobically expressed promoter. To analyse the effect of copy number, the group cloned the gene with the new promoter and inserted it into an earlier strain that had not yet acquired the new phenotype, either as a single-copy genomic insertion or expressed from the high-copy plasmid pUC19. In the latter case, aerobic citrate utilisation was far more pronounced.
There's a nice paper by George Fox's group in Houston, who have developed a database of rRNA operon numbers. It's published in BMC Microbiology (see below). This might be useful in examining microbial community profiles to see how rRNA copy number distorts the apparent abundance of different taxa (e.g. from pyrosequencing data).
Rastogi R, Wu M, Dasgupta I, Fox GE.
Visualization of ribosomal RNA operon copy number distribution.
Good point, John, thanks for that. I just found a very recent program for QIIME (and I guess it is still under active development) that automatically corrects the copy numbers (e.g. from pyrosequencing).
https://github.com/fangly/AmpliCopyrighter
This is from the website:
"The genome of Bacteria and Archaea often contains several copies of the
16S rRNA gene. This can lead to significant biases when estimating the
composition of microbial communities using 16S rRNA amplicons or
microarrays or their total abundance using 16S rRNA quantitative PCR,
since species with a large number of copies will contribute
disproportionally more 16S amplicons than species with a unique copy.
Fortunately, it is possible to infer the copy number of unsequenced
microbial species, based on that of close relatives that have been fully
sequenced. Using this information, CopyRigher corrects microbial
relative abundance by applying a weight proportional to the inverse of
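The weighting described in the quoted passage can be sketched in a few lines. This is a minimal illustration of the idea only, not CopyRighter's actual code or interface; the function name, the taxon names and all numbers are made up for the example.

```python
def correct_relative_abundance(read_counts, copy_numbers):
    """Weight each taxon's 16S read count by 1 / copy number, then renormalise.

    read_counts:  dict mapping taxon -> number of 16S amplicon reads
    copy_numbers: dict mapping taxon -> (estimated) 16S rRNA gene copy number
    """
    weighted = {}
    for taxon, reads in read_counts.items():
        copies = copy_numbers[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        # A taxon with many copies contributes proportionally more reads
        # per cell, so its count is scaled down by its copy number.
        weighted[taxon] = reads / copies

    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {taxon: value / total for taxon, value in weighted.items()}


# Hypothetical community: taxon_A carries 7 copies, taxon_B a single copy.
reads = {"taxon_A": 700, "taxon_B": 300}
copies = {"taxon_A": 7, "taxon_B": 1}

corrected = correct_relative_abundance(reads, copies)
print(corrected)
```

With these made-up numbers, taxon_A accounts for 70% of the reads but only 25% of the corrected abundance, which shows how strongly copy number can distort an uncorrected profile.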
The question of why some bacteria contain multiple copies has puzzled me ever since I realized that some bacteria contain more duplicated genes than E. coli does. We ended up finding that multiple homologous copies are fairly rare (see PMID: 24625193); still, they are found in quite a few of the bacteria sequenced so far. So it is not just rRNA genes and transposons that occur as multiple highly homologous gene copies. As to why, that is indeed difficult to figure out. We could only hypothesize that, for those bacteria that have adopted such a strategy, it probably confers some adaptive advantage in their natural habitats. It may be that other bacteria achieve the same ends (high expression, co-regulation with alternative genes, or whatever the real advantage might be) by other strategies that work well enough for them given their genotype and habitat.
Concerning the advantage of having multiple copies of the same genes, a good example is the plant pathogen Erwinia amylovora, which possesses three copies of the genes encoding type 3 secretion systems (T3SS), each copy being important in a different eukaryotic host: one for interactions with plants, one for insects and one for mammals.
As pointed out by Helga and Frédérique, the reason why some genes are repeated is simply adaptation, although it is a rare occurrence.
As Helga said, the sequencing of bacterial genomes helps us to figure out the identity and function of these genes. In this paper (http://www.nature.com/nbt/journal/v21/n11/full/nbt886.html), the entomopathogenic bacterium Photorhabdus luminescens is shown to carry repeated genes encoding a triacylglycerol lipase, a protein involved in the infection process.
So, even in this case, the gene repeats seem to be closely tied to increasing growth speed in that particular environment, which is tightly linked to the bacterium's propagation.
Multiple copies of a gene should be able to produce more protein than a single copy. For example, if a bacterium requires more of a protein (an enzyme) to act on a particular substrate than one copy of the gene can supply, more copies are required. This is the most likely reason, as others have mentioned. Additional copies of the same gene can also be provided by multiple copies of a plasmid carrying that gene, for instance under stress conditions in which a staple nutrient is limiting.
Mutations that result in Amber codon (UAG) suppression in E. coli occur only in tRNA genes that exist in multiple copies. In contrast, Ochre (UAA) and Opal (UGA) suppression sometimes occurs through tRNAs that are encoded by a single, unique gene. Can anybody explain why Amber suppressors never derive from single-copy, unique tRNA genes, whereas this is sometimes the case with Ochre and Opal suppressors?
I agree with Ami Rameshbhai Patel's answer that the duplication of some genes in a living cell is probably purely evolutionary, driven by the need to adapt to new natural habitats.