Hi,

I am very new to bioinformatics. I recently started a project that aims to perform taxonomic analysis on raw reads from mixed samples taken from the environment.

For example, we performed NGS (whole genome) on an insect, and we want to identify the taxonomy of every symbiotic bacterium present in the raw reads.

Currently, I can use Kraken2 to perform the analysis. However, I have the following questions.

1. How can I focus only on the bacteria, discard the rest of the data, and produce a summary table or visualization (the percentage of each bacterial strain)?

2. In the future we will switch to mixed environmental samples and focus on 16S rRNA. I would like to know how to perform the taxonomic analysis by first identifying the 16S rRNA reads among the raw reads and then running an analysis that focuses only on those reads. Because mixed environmental samples will contain not only bacterial reads but also eukaryotic DNA reads, I want to identify the 16S rRNA reads first and analyze only those, to reduce the processing time.

Following on from that, which process and tools should I use?
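To make question 1 more concrete, this is roughly the kind of summary I have in mind. It is only a sketch, and it assumes the standard six-column, tab-separated Kraken2 report (the file written with --report: percentage, clade reads, direct reads, rank code, NCBI taxid, indented name), with no header line. The function name bacterial_species_table is my own and not part of Kraken2. The percentages are relative to all reads assigned to the Bacteria clade, and only species-level rows (rank code S) are kept, so strains are not separated.

```python
def bacterial_species_table(report_lines):
    """Summarize species under Bacteria from a Kraken2 report.

    Assumption: standard 6-column, tab-separated report without a header,
    where the name column is indented with spaces to show tree depth.
    Returns a list of (species_name, clade_reads, percent_of_bacterial_reads),
    sorted by read count, largest first.
    """
    species = []
    bacteria_depth = None   # indentation of the "Bacteria" row, once found
    bacteria_total = 0      # reads in the whole Bacteria clade

    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines

        clade_reads = int(fields[1])
        rank = fields[3].strip()
        raw_name = fields[5]
        name = raw_name.strip()
        depth = len(raw_name) - len(raw_name.lstrip(" "))

        if bacteria_depth is None:
            # Still searching for the Bacteria domain row.
            if rank == "D" and name == "Bacteria":
                bacteria_depth = depth
                bacteria_total = clade_reads
            continue

        if depth <= bacteria_depth:
            break  # left the Bacteria subtree (e.g. reached Eukaryota)

        if rank == "S":
            species.append((name, clade_reads))

    if bacteria_total == 0:
        return []

    species.sort(key=lambda row: row[1], reverse=True)
    return [(name, reads, 100.0 * reads / bacteria_total)
            for name, reads in species]
```

I would call it on an opened report file, for example `bacterial_species_table(open("sample.kreport"))`, where sample.kreport is a placeholder file name, and then write the rows out as a table. If there is a more standard way to do this, I would be glad to use that instead.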

I found the tutorial "16S Microbial Analysis with mothur" (with Galaxy). However, when I tried it on our current NGS data from insects, it spent a very long time making contigs from non-bacterial reads. I am wondering whether there is any method that can remove the non-bacterial reads up front.

Additionally, I found other tools such as RNAmmer, barrnap, and prokka. However, these tools seem to accept only bacterial whole genomes, not mixed reads.

If you can share your experience and suggest a good workflow or tools to try, I would very much appreciate it.

Thank you very much for your great help.
