I am making a library of n clones. Is there an algorithm to estimate how many cfus are required to cover this n-clone library? Which probability distribution does this situation follow?
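It's just basic Poisson statistics. If you get 10n cfu, each clone is sampled 10 times on average, so the chance of missing any given clone is e^-10 (about 0.005%) and you'll be covering >99.99% of your input library.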
There are assumptions, however. First, we assume that the starting library DNA is far larger than the subsample of DNA being taken for transformation (i.e. there are many, many copies of each plasmid clone DNA). This way, removing a molecule doesn't noticeably change what's left in the pool, so we can treat it as sampling with replacement and the Poisson distribution is valid. The second assumption is that each clone will transform and grow equally well. If you suspect that there may be some difficult clones in your library, you might want to aim for more coverage (e.g. 20-30n).
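To make the numbers above concrete, here is a minimal Python sketch of the calculation, assuming an even library where every clone is equally likely to be picked. The function names fraction_covered and fold_needed are just illustrative, not from any package.

```python
import math

def fraction_covered(fold):
    """Expected fraction of clones seen at least once at `fold`-x coverage.

    Assumes an even library, so each clone's count is Poisson(fold)
    and P(k = 0) = exp(-fold).
    """
    return 1.0 - math.exp(-fold)

def fold_needed(target):
    """Fold coverage (cfu / n) needed so that P(k > 0) reaches `target`."""
    return -math.log(1.0 - target)

# Print the expected coverage at a few fold-coverage values.
for fold in (1, 3, 5, 10, 20):
    print(f"{fold:>2}n cfu: {fraction_covered(fold):.6f} of clones expected")
```

Under these assumptions, fold_needed(0.9999) comes out a little above 9, which is where the "10n" rule of thumb comes from.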
I should point out that this is an sgRNA library. I have read some papers that claim a minimum of 100-fold more cfus is required for their ~65000-clone library (i.e. 65000 x 100 cfu in total), and 500-fold more cfus for a custom library. Do you think that's because the distribution of the original clones was not even?
Yeah, I've read that before too. I guess the higher redundancy of each individual clone will ensure that each guide is at a high enough abundance to be useful/detectable in the experiment.
The Poisson calculation I posted above is just the probability that any given sequence will be represented in the library at least once [i.e. P(k>0)>0.9999]. If you wanted to ensure that the least abundant clones described by this distribution still hit some minimum frequency [e.g. P(k>10)>0.9999], then you'd need to increase the average frequency (lambda) accordingly.
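As a rough sketch of how you'd find that lambda, the Python below searches for the smallest fold coverage at which P(k > k_min) reaches a target. It again assumes an even library and a plain Poisson model. The names poisson_cdf and min_fold, and the 0.1 search step, are arbitrary choices for illustration.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation of the pmf."""
    term = math.exp(-lam)          # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i            # P(X = i) from P(X = i - 1)
        total += term
    return total

def min_fold(k_min, target=0.9999, step=0.1):
    """Smallest lambda (on a grid of `step`) with P(X > k_min) >= target."""
    i = 1
    while 1.0 - poisson_cdf(k_min, i * step) < target:
        i += 1
    return i * step

# k_min = 0 reproduces the "at least once" case; k_min = 10 is P(k > 10).
for k_min in (0, 10):
    print(f"P(k > {k_min}) >= 0.9999 needs about {min_fold(k_min):.1f}n cfu")
```

The k_min = 0 case lands just above 9n, matching the 10n rule, and the k_min = 10 case needs noticeably more.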
I am a noob at probability calculations. But I guess that if I want to guarantee a minimum representation for the least abundant clones, I would need to know the maximum difference in abundance between guides. Is there an existing statistic that describes that?
I've heard about weighted Poisson distributions that might allow you to do that. Generally (depending on how the library was made), it's assumed that the clonal distribution is even. I suppose you could sequence the library before growing it up to confirm that this is the case.
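If the library turns out not to be even, one simple way to think about it (a sketch, not a full weighted-Poisson treatment) is to give each clone its own lambda, scaled by its abundance relative to the library mean. The relative abundances below (1.0, 0.5, 0.1) are made-up values for illustration. In practice they would come from sequencing the library.

```python
import math

def p_missing(fold, rel_abundance):
    """P(clone absent) at `fold`-x coverage.

    `rel_abundance` is the clone's abundance relative to the library
    mean (1.0 = average clone), so its count is Poisson(fold * rel_abundance).
    """
    return math.exp(-fold * rel_abundance)

# Hypothetical relative abundances; compare 10x and 100x coverage.
for rel in (1.0, 0.5, 0.1):
    print(f"rel. abundance {rel}: "
          f"P(missing) at 10n = {p_missing(10, rel):.2e}, "
          f"at 100n = {p_missing(100, rel):.2e}")
```

In this model a clone at one tenth of the mean abundance needs ten times the coverage to reach the same chance of being represented, which may be part of why the published protocols ask for 100-fold or more.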