Hi, I'm curious to know if data on chemical compounds from PubChem, such as water solubility properties, can be used to train a machine learning model for commercial purposes. Will this infringe on any licenses?
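Most data in PubChem can be freely used, including for commercial applications such as training a machine learning model, but it is not all unconditionally in the public domain. The National Center for Biotechnology Information (NCBI), which hosts PubChem, places no restrictions of its own on the use or distribution of the data. However, PubChem aggregates records from many contributing sources, and NCBI's policies note that some submitters may claim copyright or other intellectual property rights in the data they provide. In practice:
PubChem data is freely available to the public.
NCBI itself imposes no restrictions on the use of data retrieved from PubChem, including for commercial purposes, but individual data sources may attach their own license terms, so check the source listed for each value you plan to use (for example, each water solubility entry).
Users are encouraged to attribute the source of the data when sharing or publishing results.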
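To make attribution practical, it can help to record which contributing source each value came from, along with any license note attached to it. The following is a minimal Python sketch, not an official client. It assumes the PUG View endpoint and its `heading` query parameter behave as shown, and that each entry under `Record` → `Reference` carries `SourceName`, `LicenseNote`, and `LicenseURL` fields. The function names `fetch_record` and `list_sources` are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed PUG View URL pattern; verify against the current PubChem docs.
PUG_VIEW = (
    "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/"
    "{cid}/JSON?heading={heading}"
)


def fetch_record(cid, heading="Solubility"):
    """Download one annotation section for a compound (needs network access)."""
    url = PUG_VIEW.format(cid=cid, heading=urllib.parse.quote(heading))
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def list_sources(record):
    """Return source name and license fields for every reference in a record.

    Missing fields come back as None, so entries without a license note
    are easy to flag for manual review.
    """
    refs = record.get("Record", {}).get("Reference", [])
    return [
        {
            "source": ref.get("SourceName"),
            "license_note": ref.get("LicenseNote"),
            "license_url": ref.get("LicenseURL"),
        }
        for ref in refs
    ]
```

For example, `list_sources(fetch_record(2244))` would list the sources behind the solubility annotations for CID 2244 (aspirin), assuming the response has the shape described above. Storing this list next to the training data keeps a record of where each value came from and which terms, if any, the source attached.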