I have carried out SMILES enumeration as data augmentation for an ML model. I started with 300 SMILES and augmented them 10-fold, giving 3000 SMILES (Reference: Article SMILES Enumeration as Data Augmentation for Neural Network M...). I want to use the augmented data to train a model that predicts IC50 values. For each enumerated SMILES, should I use the IC50 value of its parent SMILES? Kindly guide me. Thanks in advance!
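To make the question concrete, here is a minimal sketch of the labelling I have in mind, where every enumerated SMILES simply inherits the IC50 of the parent it was generated from. The molecules, the IC50 numbers, and the helper name `label_enumerated` are all made up for illustration, and the variants are written by hand rather than produced by a real enumeration tool.

```python
# Hypothetical toy data: parent SMILES -> measured IC50 in nM (values are invented).
parent_ic50 = {
    "CCO": 1200.0,        # ethanol
    "c1ccccc1O": 350.0,   # phenol
}

# Hypothetical enumerated variants of each parent. Each list holds different
# valid SMILES strings for the same molecule (hand-written here, not generated).
enumerated = {
    "CCO": ["CCO", "OCC", "C(O)C"],
    "c1ccccc1O": ["c1ccccc1O", "Oc1ccccc1", "c1ccc(O)cc1"],
}


def label_enumerated(parent_ic50, enumerated):
    """Return (smiles, parent, ic50) rows, copying the parent's IC50 to each variant."""
    rows = []
    for parent, variants in enumerated.items():
        for smiles in variants:
            rows.append((smiles, parent, parent_ic50[parent]))
    return rows


rows = label_enumerated(parent_ic50, enumerated)
for smiles, parent, ic50 in rows:
    print(smiles, parent, ic50)
```

The parent column is kept in each row only so that it stays clear which enumerated strings describe the same molecule.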