I was wondering if somebody could guide me on how representative cDNA libraries are. For example, do they actually show just 5% of the full transcriptome, or are there libraries with more coverage?
The question is insufficiently specific. Which cDNA library? How was it made: from polyA-selected mRNA? Random primed or polyA primed? What tissue: a heterogeneous cell population or a clonal cell line? What do you mean by representative: the proportion of the transcript population represented, the proportion of transcript types represented, or even the proportion of full-length transcripts? Finally, and most importantly, what do you want it for?
If you want it to clone a cDNA from, then bear in mind that the complexity of such a library is typically 1-5 million "independent" clones, which rather limits your chance of finding a minor transcript from a minority cell type in the sample (e.g. cloning a neuroblast transcript from a total embryo cDNA library). A large proportion of those clones will come from a minority of genes, simply because that is how the source mRNA is distributed. The proportion of long inserts (>3 kb) is small, so if you are hoping to clone a full-length cDNA from a long transcript, you'll need to use a random primed library and go through multiple cycles of cloning, followed by final assembly of the pieces. The law of cussedness dictates that any gene you want is invariably only transcribed as a very long, alternatively spliced transcript at low levels in a minor population. After all that, wouldn't you be better off doing it by PCR in many cases? Or even total synthesis if you already know the sequence. The latter only looks expensive when you don't factor in the labour costs of going through a conventional cDNA cloning exercise but hey, that's the way academia works :-)
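To put rough numbers on the "chance of finding a minor transcript" point, here is a minimal sketch using the standard sampling estimate (often called the Clarke-Carbon formula), P = 1 - (1 - f)^N, where f is the fraction of clones derived from your transcript and N is the number of independent clones. It assumes every clone is an independent random draw from the mRNA pool, which real libraries only approximate (cloning bias, insert length and so on are ignored). The function names and the abundance value in the usage note are hypothetical, chosen only for illustration.

```python
import math


def prob_represented(f, n_clones):
    """Probability that a transcript making up fraction f of the mRNA
    appears at least once among n_clones independent clones.

    Assumes idealised random sampling: P = 1 - (1 - f)**N.
    log1p/expm1 keep the result accurate when f is very small.
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must be strictly between 0 and 1")
    return -math.expm1(n_clones * math.log1p(-f))


def clones_needed(f, p=0.99):
    """Number of independent clones needed to include the transcript
    at least once with probability p: N = ln(1 - p) / ln(1 - f)."""
    if not 0.0 < f < 1.0 or not 0.0 < p < 1.0:
        raise ValueError("f and p must be strictly between 0 and 1")
    return math.ceil(math.log1p(-p) / math.log1p(-f))


if __name__ == "__main__":
    f = 1e-6  # hypothetical: transcript is 1 in a million of the mRNA pool
    for n in (1_000_000, 5_000_000):
        print(f"N = {n:>9,}: P(at least one clone) = {prob_represented(f, n):.3f}")
    print(f"Clones needed for 99% confidence: {clones_needed(f, 0.99):,}")
```

With that made-up abundance of one in a million, a library of 1 million independent clones gives only about a 63% chance of containing even one copy, and you would want roughly 4.6 million clones for 99% confidence. That sits right at the top of the typical 1-5 million range, before you even ask whether the clone is full length.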
Thanks for the details. I am actually reading older papers in which random primed cDNA libraries were used to identify a gene transcript. It's clear to me now. Thank you.