Hi everyone, I’m Ifra! As a botanist with experience as a computational biologist, I’m curious to know your opinions on the role of medicinal plants or natural products in the era of AI for drug discovery.
That’s such an exciting intersection of fields to explore. Medicinal plants and natural products are incredibly valuable for drug discovery, and AI is transforming how we approach this. AI accelerates the identification of bioactive compounds, predicts their interactions with targets, and even suggests modifications to improve efficacy or reduce toxicity. It also allows for mining massive datasets like genomic information, phytochemical libraries, and ethnobotanical knowledge to uncover leads that might otherwise remain hidden.
AI is opening new possibilities in drug discovery from natural products by helping to quickly analyze vast amounts of botanical data and predict how plant compounds might work in the body. It aids in discovering hidden therapeutic potentials by modeling molecular interactions, optimizing compounds, and analyzing plant-based chemical libraries to find promising drug candidates more efficiently.
I am currently doing my Ph.D. in exactly this field. I can assure you it is fascinating, and natural products hold great promise for the discovery of new life-saving compounds.
Please allow me to share some key review papers that will help you go deeper.
IYH. Dear Ifra Saifi, here is an analysis of the first review paper (h/t Angelos Kollias):
AI integrates -omics data (genomics, metabolomics) to predict novel natural products, their structures, and bioactivities, prioritizing candidates for lab testing.
Practical Deployment and Usability:
Real-World Use: AI can prioritize natural product candidates for lab testing, reducing time/cost in drug discovery. For example, CO-ADD’s standardized antibiotic screening protocol accelerates compound validation.
Non-Expert Accessibility: Tools like DECIMER (chemical image recognition) and ReDU (reanalysis of MS data) simplify data processing for researchers without AI expertise.
Example: Botanists studying medicinal plants could use AI to predict bioactive compounds from plant metabolomics data, guiding extraction and synthesis efforts.
Unexpected Findings:
Data Limitations: Despite advances in AI, high-quality datasets for natural products are scarce, hindering model training. This underscores the critical need for data standardization.
Overfitting Risks: AI models often fail to generalize due to biased or incomplete datasets, requiring rigorous validation (e.g., cross-validation, independent test sets).
BGC Prediction Challenges: Deep learning tools like DeepBGC improve detection of biosynthetic gene clusters (BGCs) but still struggle with false positives/negatives, highlighting the need for hybrid approaches.
Structural Complexity: AI struggles to predict novel chemistry or enzyme activities not previously observed, limiting its ability to discover entirely new mechanisms.
Interoperability Gaps: Fragmented natural product databases lack standardized protocols, making data integration (e.g., linking metabolites to BGCs) difficult.
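To make the validation point under "Overfitting Risks" a bit more concrete, here is a minimal sketch of holding out an independent test set and running k-fold cross-validation on the rest. It uses only the Python standard library; the function names (`holdout_split`, `k_fold`), the dataset size of 50, and the 20% test fraction are illustrative assumptions, not anything taken from the review.

```python
import random


def holdout_split(n_items, test_fraction=0.2, seed=0):
    """Shuffle indices once and set aside an independent test set."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(n_items * test_fraction)))
    # The test indices are never used for training or model selection.
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical dataset of 50 labelled compounds, referenced by index only.
development, test = holdout_split(50, test_fraction=0.2, seed=42)
splits = list(k_fold(development, k=5))
```

The idea is to tune and compare models only on the `splits`, then report performance once on `test`. For natural products it may also be worth keeping closely related compounds in the same fold, which this simple sketch does not attempt.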
Approach:
Methodology:
Genome Mining: Tools like DeepBGC and ClusterFinder predict BGCs from DNA sequences.
Metabolome Analysis: AI algorithms (e.g., Spec2Vec) match mass spectrometry (MS) data to compound structures.
Bioactivity Prediction: Models like ChemBERTa predict bioactivities from chemical structures, while tools like NPLinker integrate omics data.
Problem-Solving Techniques:
Data Integration: Combining genomic, metabolomic, and phenotypic data to predict natural product targets.
Algorithmic Advances: Using deep learning (e.g., graph neural networks) to model molecular interactions and predict novel compounds.
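For readers new to the metabolome-analysis step, the sketch below shows the classical binned cosine score between two MS/MS spectra, i.e. the kind of traditional similarity measure that learned approaches are usually compared against. To be clear, this is not Spec2Vec itself: the peak lists, the 1 Da bin width, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned intensity vectors (0 to 1)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up (m/z, intensity) peak lists, not real measurements.
spectrum_a = [(101.2, 30.0), (150.7, 100.0), (203.1, 55.0)]
spectrum_b = [(101.4, 25.0), (150.9, 90.0), (250.3, 40.0)]
spectrum_c = [(75.0, 10.0), (310.5, 80.0)]

score_related = cosine_similarity(spectrum_a, spectrum_b)
score_unrelated = cosine_similarity(spectrum_a, spectrum_c)
```

Spectra that share peaks in the same bins score closer to 1, while spectra with no shared bins score 0. Embedding-based methods aim to capture relationships between fragments that this peak-by-peak comparison misses.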
Results and Evaluation:
Key Findings: AI tools like DeepRiPP and DECIMER successfully identified novel natural products (e.g., deepflavo, rivulariapeptolides). Models integrating genomic and metabolomic data improved BGC-to-metabolite matching.
Quantitative Results: DeepBGC achieved 80% accuracy in BGC prediction (vs. 60% for rule-based methods). Spec2Vec improved MS similarity scoring by 30% compared to traditional methods.
Notable Achievements: AlphaFold’s protein structure prediction breakthroughs could aid in modeling natural product biosynthesis. NPLinker and GNPS enabled large-scale integration of omics data for compound discovery.
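A side note on reading accuracy figures like the ones above: because false positives and false negatives matter for BGC prediction, it can help to look at precision and recall alongside accuracy. Below is a small sketch of how these are computed from predicted versus known labels. The label vectors are invented for illustration and do not come from any of the tools mentioned.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Return accuracy, precision and recall as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels: 1 = region is a real BGC, 0 = it is not.
known = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
report = summarize(known, predicted)
```

In this toy case one real cluster is missed and one non-cluster is flagged, so precision and recall both come out lower than accuracy, which is the kind of gap a single accuracy number can hide.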