I want to see the composition and relative abundance of potential animal pathogens in my amplicon dataset. Is there a suitable database for this annotation? Thanks!
I don't think it is possible to define which bacterial "species" will be pathogenic to one or more animal lineages. For example, most E. coli are harmless to most vertebrates, but if a phage encoding the Shiga toxin integrates into the genome, the same bacterium can be deadly as a STEC (Shiga toxin–producing Escherichia coli). Many species of Clostridia can gain or lose the botulinum toxin operon. So the 16S ribosomal RNA sequence does not tell you whether the genome is from a pathogenic strain or not. Some bacteria may be harmless to invertebrate animals and pathogenic only to vertebrate animals. What range of "animals" are you interested in? How will you define "pathogenicity"? Many bacteria are harmless to humans if they remain in the intestines, but deadly if they enter the bloodstream, for example.
In general, 16S ribosomal RNA sequences can give a ballpark identification of the bacteria present at the genus or species level, but cannot tell pathogenic from nonpathogenic lineages within a genus or species.
Dear Professor, as you said, it is indeed difficult to accurately determine whether a species is pathogenic with 16S data, and the bacteria that cause disease in different animals are often different. We currently have some data about bacterial communities in the plastisphere and we hope to make a rough assessment of the pathogenic potential of these samples. Thank you very much for your kind reply! Brian Thomas Foley
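For the kind of rough assessment described here, a genus-level screen could be sketched as below. This is only a minimal illustration under stated assumptions: the count table, the ASV-to-genus mapping, and the watch list are hypothetical toy values, the function name flagged_relative_abundance is made up for this example, and the watch list is an arbitrary user choice rather than a validated pathogen database. As noted above, a hit means only that a genus containing some pathogenic members is present, not that a pathogenic strain is.

```python
from collections import defaultdict

# Hypothetical toy inputs: read counts per ASV in each sample,
# and a genus assignment per ASV (e.g. from a 16S taxonomy classifier).
counts = {
    "sample_A": {"asv1": 60, "asv2": 30, "asv3": 10},
    "sample_B": {"asv1": 5, "asv2": 45, "asv3": 50},
}
genus_of = {"asv1": "Vibrio", "asv2": "Pseudomonas", "asv3": "Escherichia"}

# Assumption: an arbitrary, user-supplied list of genera of interest.
# It is NOT a curated pathogen database.
watch_genera = {"Vibrio", "Escherichia"}


def flagged_relative_abundance(counts, genus_of, watch_genera):
    """Return {sample: {genus: fraction of reads}} for watch-list genera only."""
    result = {}
    for sample, asv_counts in counts.items():
        total = sum(asv_counts.values())
        per_genus = defaultdict(float)
        if total > 0:  # skip empty samples to avoid division by zero
            for asv, n in asv_counts.items():
                genus = genus_of.get(asv)  # None if the ASV is unclassified
                if genus in watch_genera:
                    per_genus[genus] += n / total
        result[sample] = dict(per_genus)
    return result


flagged = flagged_relative_abundance(counts, genus_of, watch_genera)
for sample, fractions in flagged.items():
    print(sample, fractions)
```

Any genus flagged this way would still need strain-level evidence (for example toxin or virulence genes from metagenomic or isolate data) before being called pathogenic for a given host.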