Hi Ravi,

OTU-picking algorithms keep things simple by assuming a fixed sequence-divergence threshold for OTUs. While fast and readily applicable to very large sequence datasets, these algorithms are sensitive to the choice of threshold, which is essentially a subjective decision by the investigator. Because of this, workers have recently developed coalescent-based species delimitation methods for species discovery from molecular data. The methods I'm referring to (e.g. the generalized mixed Yule-coalescent, or GMYC; Poisson tree processes, or PTP; etc.) are implemented using more rigorous Bayesian and maximum-likelihood (ML) algorithms, so they are more difficult to run on large datasets. You'll need to find a way to pull this off, or to determine which method works best for you.

What data are you analyzing, and what is your goal in identifying OTUs? If you just want to determine OTUs in a microbial community dataset (e.g. 16S amplicon reads), QIIME would be good. If you have standard mtDNA gene sequences for several hundred samples (up to a maximum of maybe 500-1000?), then you should definitely use the more rigorous coalescent-based methods I mentioned. With multilocus data that are neutrally evolving and show little or no evidence of migration, you can use programs such as BPP (which can also be run with just mtDNA but is best run with many loci) for Bayesian species delimitation.

Again, more info would help us help you here, but I hope this is helpful.

Cheers,
Justin
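
P.S. To make the threshold sensitivity concrete, here is a minimal Python sketch of threshold-based OTU picking. It is a toy illustration only, not what QIIME or any other package actually runs: the function names (`p_distance`, `pick_otus`), the made-up sequences, and the exaggerated thresholds are all my own assumptions, and it presumes the sequences are already aligned to equal length.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pick_otus(seqs, threshold):
    """Greedy clustering: each sequence joins the first OTU whose seed is
    within `threshold`; otherwise it becomes the seed of a new OTU."""
    otus = []  # list of (seed_sequence, [member_ids])
    for seq_id, seq in seqs.items():
        for seed, members in otus:
            if p_distance(seq, seed) <= threshold:
                members.append(seq_id)
                break
        else:
            # No existing seed was close enough, so start a new OTU.
            otus.append((seq, [seq_id]))
    return [members for _, members in otus]


# Hypothetical toy alignment (10 sites), invented for illustration.
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "ACGTACGGTT",
    "s4": "TTGTACGGTT",
}

# Thresholds are exaggerated because the toy sequences are so short.
for threshold in (0.05, 0.15, 0.35):
    otus = pick_otus(toy, threshold)
    print(f"threshold={threshold:.2f}: {len(otus)} OTUs -> {otus}")
```

On this toy alignment the same four sequences come out as 4, 3, or 2 OTUs depending only on the threshold, which is exactly the subjectivity I mean. Note that greedy clustering like this also depends on the input order of the sequences, since earlier sequences become the seeds.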