Hello all,
I am having an issue with PICRUSt2 on 16S data. After every run there is a warning that more than half of the sequences were removed from further analysis. I suspect the reason is that the ASV FASTA file contains sequences in mixed orientation, i.e. from both the positive and the negative strand. As far as I understand, PICRUSt2 can only handle positive-strand sequences, so the output is based on only about 50% of the sequences in the FASTA file. I am looking for suggestions on how to programmatically identify the negative-strand sequences in the FASTA file (for example based on NCBI BLASTn results) and reverse-complement them. Doing this by hand is not practical for a FASTA file with 6000 sequences, and I have limited coding experience. Any help would be greatly appreciated.
I am running the PICRUSt2 pipeline as described here: https://github.com/picrust/picrust2/wiki/Full-pipeline-script. The ASV file was generated from the raw FASTQ files with QIIME2.
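
A minimal Python sketch of the task described above, under some assumptions: the BLASTn results are available as a tab-separated hit table in the standard 12-column layout (as produced by BLAST+ with `-outfmt 6`), the first row listed for each query is its best hit, and a minus-strand hit is recognisable because the subject start is larger than the subject end. The function names and the file names in the usage note are hypothetical, chosen only for illustration.

```python
# Complement table covering DNA/RNA bases and IUPAC ambiguity codes.
COMPLEMENT = str.maketrans(
    "ACGTURYKMSWBDHVNacgturykmswbdhvn",
    "TGCAAYRMKSWVHDBNtgcaayrmkswvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def minus_strand_ids(blast_lines):
    """Collect query IDs whose first-listed hit lies on the minus strand.

    Assumes 12-column tabular output: column 9 is the subject start and
    column 10 the subject end (1-based); start > end means a minus-strand hit.
    """
    seen, minus = set(), set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query_id = fields[0]
        if query_id in seen:
            continue  # only the first (assumed best) hit per query is used
        seen.add(query_id)
        if int(fields[8]) > int(fields[9]):
            minus.add(query_id)
    return minus


def fix_orientation(fasta_lines, minus_ids):
    """Yield (header, sequence), reverse-complementing the flagged sequences."""
    for header, seq in read_fasta(fasta_lines):
        seq_id = header.split()[0]  # ID is the header text up to the first space
        if seq_id in minus_ids:
            seq = reverse_complement(seq)
        yield header, seq


def orient_file(fasta_path, blast_path, out_path):
    """Read a FASTA file and a hit table, write the re-oriented FASTA file."""
    with open(blast_path) as handle:
        minus = minus_strand_ids(handle)
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for header, seq in fix_orientation(fasta, minus):
            out.write(f">{header}\n{seq}\n")
```

It could be called as `orient_file("asv.fasta", "blast_hits.tsv", "asv_oriented.fasta")`, where all three file names are placeholders. Sequences with no BLAST hit are written unchanged, so it would still be worth checking how many of them remain before re-running the pipeline.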