When I collect property data for small molecules from public-domain sources, such as pKa, solubility, and clearance, are there methods to evaluate the quality of the datasets before training a machine learning model?
You can check alvaMolecule, free software for academic and non-commercial research that can be used to visualise, analyse, curate, and standardize your molecular dataset. You can find more info at: https://www.alvascience.com/alvamolecule/
You can also check the webinar "Molecular data curation with alvaMolecule" at https://www.youtube.com/watch?v=R670lqrga9k
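If you also want a quick scripted sanity check alongside a curation tool, here is a minimal sketch in plain Python of two common checks: duplicate structures with conflicting measurements, and values outside a plausible range. It is not alvaMolecule's workflow. The function names, the tolerance, the range limits, and the sample records are all assumptions made up for illustration, and it assumes the structure keys (e.g. SMILES strings) have already been standardized so that identical molecules share an identical key.

```python
from collections import defaultdict


def find_conflicting_duplicates(records, tolerance):
    """Return structure keys whose repeated measurements disagree.

    records:   iterable of (structure_key, value) pairs
    tolerance: largest acceptable spread (max - min) within one key;
               the right threshold depends on the property and is an assumption
    """
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)

    conflicts = {}
    for key, values in groups.items():
        # Only keys measured more than once can conflict.
        if len(values) > 1 and max(values) - min(values) > tolerance:
            conflicts[key] = sorted(values)
    return conflicts


def find_out_of_range(records, low, high):
    """Return (key, value) pairs whose value lies outside [low, high]."""
    return [(key, value) for key, value in records if not (low <= value <= high)]


# Illustrative pKa-like records (hypothetical data, not from a real dataset).
records = [
    ("CC(=O)O", 4.76),
    ("CC(=O)O", 4.80),
    ("c1ccccc1O", 9.95),
    ("c1ccccc1O", 6.50),  # disagrees with the entry above
    ("CCN", 45.0),        # implausible value, possibly a unit or typing error
]

conflicts = find_conflicting_duplicates(records, tolerance=0.5)
outliers = find_out_of_range(records, low=-10.0, high=20.0)

print("Conflicting duplicates:", conflicts)
print("Out-of-range values:", outliers)
```

Flagged entries are candidates for manual review against the original source rather than automatic removal, since a large spread can also come from different experimental conditions.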