After high-throughput sequencing of 16S rDNA, the sequencing depth usually varies a lot among samples. Because sequencing depth can affect alpha and beta diversity analysis, we usually use rarefaction (randomly sub-sampling sequences from each sample) to equalize the number of sequences per sample. But when we analyze the diversity of functional genes (e.g. the amoA gene of ammonia-oxidizing microorganisms), we often use a clone library method because of the read-length limitation of NGS. As a result, we obtain only a very limited number of sequences per sample (e.g. 50 to 100, varying among samples). If we randomly sub-sample as we do for the 16S rDNA data, we may lose nearly half of the sequences in some samples, and this should have a great influence on the alpha or beta diversity. So, in this case, can we calculate the alpha and beta diversity based on the relative abundance of OTUs instead? That is, before calculating the diversity, the counts in each sample are first normalized by dividing them by the total number of sequences in that sample. Is this transformation reliable and scientifically sound? Has anyone used this method to calculate alpha or beta diversity? If you have related references, I would greatly appreciate them.
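To make the transformation I am asking about concrete, here is a minimal sketch in Python with NumPy. The count table, the sample labels, and the function names (`relative_abundance`, `shannon`, `bray_curtis`) are all made up for illustration; Shannon and Bray-Curtis are only example indices, not necessarily the ones that should be used.

```python
import numpy as np

# Hypothetical clone-library OTU counts (rows = samples, columns = OTUs).
# These numbers are invented for illustration only.
counts = np.array([
    [30, 15, 5, 0],    # sample A: 50 sequences
    [60, 30, 10, 0],   # sample B: 100 sequences, same proportions as A
    [10, 10, 40, 20],  # sample C: 80 sequences
], dtype=float)

def relative_abundance(table):
    """Divide each sample's counts by that sample's total sequence number."""
    totals = table.sum(axis=1, keepdims=True)
    return table / totals

def shannon(p):
    """Shannon index (natural log) from a vector of proportions."""
    nonzero = p[p > 0]  # skip absent OTUs, since 0 * log(0) is taken as 0
    return float(-(nonzero * np.log(nonzero)).sum())

def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return float(np.abs(p - q).sum() / (p + q).sum())

rel = relative_abundance(counts)
shannon_values = [shannon(row) for row in rel]
bc_ab = bray_curtis(rel[0], rel[1])  # A vs B: identical proportions
bc_ac = bray_curtis(rel[0], rel[2])  # A vs C: different composition
```

In this toy table, samples A and B have the same proportions, so they get the same Shannon value and a Bray-Curtis dissimilarity of zero despite the two-fold difference in depth. The division only changes the scale, though: a sample with 50 clones still has fewer chances to detect rare OTUs than one with 100, which is part of what I am unsure about.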
Thank you!