It is common practice to generate, say, 16S rRNA gene sequences as part of the identification of bacteria. When you use BLAST to try to identify the organisms, you get various statistics as part of the output, including percent identity, score, and query coverage. My question, which is actually a simple one, is about the decision-making process. Let's say you get 90% identity to a hit with one sequence: is that enough to assign a taxonomic classification? What criteria are used to decide whether a sequence belongs to a given genus or species?
Ngonidzashe Mangoma, when analysing bacterial sequences using BLAST, the minimum per cent identity for placement into genus and species is not fixed; it depends on factors such as the marker gene used and the taxonomic group concerned. BLAST is a sequence alignment tool that compares query sequences to a database of known sequences, so on its own it only tells you how similar your sequence is to what is in that database. Genus and species classification typically require a broader range of criteria, such as phylogenetic analysis, sequence features, and biochemical characteristics.
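For 16S rRNA analysis, a commonly used method for bacterial identification, a general guideline suggests a minimum per cent identity of 98.65% for placement into the same species and 95% for placement into the same genus. By that guideline, the 90% identity in your example would not be enough to place the sequence in the genus or species of the hit. However, it's important to note that these values are guidelines rather than fixed rules: they can vary between taxonomic groups, and identity above a threshold is compatible with, but does not by itself prove, membership of the same species or genus.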
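To make the 16S guideline concrete, here is a minimal Python sketch that applies the two cutoffs quoted above to BLAST tabular output (`-outfmt 6` with its default columns, where the third column is percent identity). The function names and the example rows are made up for illustration, and the labels only say which rank the identity value is compatible with; they are not a classification.

```python
# Guideline cutoffs for 16S rRNA gene identity, as quoted above.
# They are rules of thumb, not hard limits.
SPECIES_MIN_IDENTITY = 98.65
GENUS_MIN_IDENTITY = 95.0


def rank_supported(pident: float) -> str:
    """Return the lowest rank that a 16S percent identity is compatible with."""
    if pident >= SPECIES_MIN_IDENTITY:
        return "species"
    if pident >= GENUS_MIN_IDENTITY:
        return "genus"
    return "above genus"


def summarise_hits(blast_tsv: str) -> list[tuple[str, str, float, str]]:
    """Parse BLAST -outfmt 6 text (default columns) and label every hit.

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    results = []
    for line in blast_tsv.strip().splitlines():
        fields = line.split("\t")
        qseqid, sseqid, pident = fields[0], fields[1], float(fields[2])
        results.append((qseqid, sseqid, pident, rank_supported(pident)))
    return results


# Invented example rows (hypothetical isolates and reference names).
example = "\n".join([
    "isolate1\trefA\t99.10\t1450\t13\t0\t1\t1450\t1\t1450\t0.0\t2600",
    "isolate2\trefB\t96.00\t1450\t58\t0\t1\t1450\t1\t1450\t0.0\t2350",
    "isolate3\trefC\t90.00\t1450\t145\t0\t1\t1450\t1\t1450\t0.0\t1900",
])

for qseqid, sseqid, pident, rank in summarise_hits(example):
    print(f"{qseqid} vs {sseqid}: {pident:.2f}% identity, compatible with: {rank}")
```

In practice you would also want to check query coverage and alignment length before trusting the identity value, since a high identity over a short partial alignment says much less than the same identity over the full gene.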
Whole genome sequencing (WGS) provides a more comprehensive approach to bacterial classification. The Genome Taxonomy Database (GTDB) is an increasingly accepted resource for bacterial taxonomy. For WGS-based analysis, GTDB and many other studies use a 95% average nucleotide identity (ANI) cutoff for species delineation. ANI measures the similarity of two genomes at the nucleotide level across their shared regions. For genus classification there is no equally well-established ANI cutoff: values of around 70-75% ANI are sometimes quoted, but ANI becomes unreliable at that level of divergence, and GTDB itself delineates genera and higher ranks from the genome phylogeny rather than from ANI.
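The species-level ANI rule can be sketched in the same way. This assumes you already have pairwise ANI values from a dedicated ANI tool; the genome names, the values, and the function name below are invented for illustration, and only the 95% species cutoff from above is applied.

```python
# Species-level ANI cutoff quoted above (a widely used guideline).
SPECIES_ANI_CUTOFF = 95.0


def same_species_pairs(pairwise_ani, cutoff=SPECIES_ANI_CUTOFF):
    """Return the genome pairs whose ANI meets or exceeds the species cutoff."""
    return sorted(pair for pair, ani in pairwise_ani.items() if ani >= cutoff)


# Invented pairwise ANI values (percent) for three hypothetical genomes.
example_ani = {
    ("genomeA", "genomeB"): 97.8,
    ("genomeA", "genomeC"): 88.4,
    ("genomeB", "genomeC"): 88.9,
}

for first, second in same_species_pairs(example_ani):
    print(f"{first} and {second} meet the {SPECIES_ANI_CUTOFF}% ANI species guideline")
```

Pairs that fall below the cutoff are simply treated as different species here; the sketch deliberately makes no genus-level call, because ANI is a poor guide at that distance.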
In summary, while BLAST can assist with initial identification, genus and species classification require a combination of methods and criteria. For 16S rRNA analysis, approximate cutoffs are 98.65% for species and 95% for genus. For WGS-based analysis, a cutoff of 95% ANI is commonly used for species, including by GTDB, while lower values of around 70-75% ANI are at best a rough guide for genus classification.