The database I use to build my QSAR model contains 40 compounds with a common core. Could I add other compounds that have the same biological activity but a different core?
In this case, you can use 2D-QSAR, but the resulting model may need more variables and have a lower R2. Data sets with different cores are a problem for 3D-QSAR models because they lead to poor alignment, but 2D-QSAR involves no alignment, so that problem does not arise.
It is better to use experimental biological activities obtained from a single lab. Adding compounds with different cores can improve your QSAR model, but their activities should be reported by the same group.
The way IC50 values are experimentally determined in these databases (and publications) varies quite substantially. Some were determined using commercial kits, some using proteins expressed in-house, and some by CROs using “know-how” procedures.
It is unlikely that these IC50 values can be readily compared and used in QSAR equations without a “calibration”. In our experience, literature IC50 values for the same compound can differ by a factor of thousands.
Therefore, if you want to use different databases, you must check the experimental protocols used to determine the biological activity and carefully curate the data.
If the two cores are sterically very similar, so that the substituents off the core are likely to sit in the same place in the active site, then you can build a 2D-QSAR model using both sets of data (preferably, as stated previously, if the data for both sets were generated in the same lab). If you are using standard linear regression techniques, you can do this by including an indicator variable that takes the value 1 for one core and 0 for the other. The coefficient of the indicator variable in the final model gives you the average difference in activity between the two cores.
If the cores are so different that the two series are unlikely to have similar binding modes then it is unlikely that this approach will work.
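As a rough illustration of the indicator-variable idea described above, here is a minimal Python sketch. Everything in it is hypothetical: the single descriptor, the eight pIC50 values, and the helper names `solve_linear_system` and `fit_ols` are invented for the example. The data are constructed without noise, so that core A is exactly 0.8 log units more active than core B at the same descriptor value. Real QSAR work would normally use a statistics package and several descriptors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical (descriptor, pIC50) pairs for two series; values are invented.
core_a = [(1.0, 5.3), (2.0, 5.8), (3.0, 6.3), (4.0, 6.8)]
core_b = [(1.5, 4.75), (2.5, 5.25), (3.5, 5.75), (4.5, 6.25)]

# Design matrix columns: intercept, descriptor, indicator (1 = core A, 0 = core B).
rows = [[1.0, x, 1.0] for x, _ in core_a] + [[1.0, x, 0.0] for x, _ in core_b]
y = [p for _, p in core_a] + [p for _, p in core_b]

intercept, slope, core_offset = fit_ols(rows, y)
print(f"intercept={intercept:.2f} slope={slope:.2f} core offset={core_offset:.2f}")
```

The third coefficient, `core_offset`, is the average activity difference between the two cores at a fixed descriptor value. This model forces both series to share the same slope, which only makes sense under the similar-binding-mode assumption stated above.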
Ensure that the compounds from the different databases were tested using the same experimental protocol. Also check whether compounds that appear in more than one database have concordant data.
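One way to check duplicates is to convert every IC50 to pIC50 and flag compounds whose values from different sources disagree by more than a chosen tolerance. The sketch below assumes IC50 values in nM. The record layout, the compound and source names, the function names, and the 0.5 log unit tolerance are all hypothetical choices for illustration, not an established standard.

```python
import math
from collections import defaultdict


def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 (-log10 of the molar IC50)."""
    return 9.0 - math.log10(ic50_nm)


def find_discordant_duplicates(records, max_spread=0.5):
    """Return {compound_id: spread} for compounds measured more than once
    whose pIC50 values span more than max_spread log units.

    max_spread=0.5 is an arbitrary illustrative tolerance.
    """
    by_compound = defaultdict(list)
    for compound_id, _source, ic50_nm in records:
        by_compound[compound_id].append(pic50_from_nm(ic50_nm))

    discordant = {}
    for compound_id, values in by_compound.items():
        if len(values) > 1:
            spread = max(values) - min(values)
            if spread > max_spread:
                discordant[compound_id] = spread
    return discordant


# Hypothetical records: (compound id, source, IC50 in nM). Values are invented.
records = [
    ("cmpd-1", "source A", 10.0),
    ("cmpd-1", "source B", 20.0),    # 2-fold difference between sources
    ("cmpd-2", "source A", 5.0),
    ("cmpd-2", "source B", 5000.0),  # 1000-fold difference between sources
    ("cmpd-3", "source A", 100.0),   # single measurement, nothing to compare
]

flagged = find_discordant_duplicates(records)
for compound_id, spread in flagged.items():
    print(f"{compound_id}: pIC50 values differ by {spread:.2f} log units")
```

A flagged compound is a hint that the two sources used different assay conditions, so their data should not be pooled without looking at the protocols first.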
This rule of using only data from the same article/database is old-fashioned!
You should check the experimental protocols: if they are similar and the cut-off value is the same, you can use compounds from different sources to build your QSAR data set. Indeed, you should use more compounds with the same activity, because a series of 40 congeneric compounds is very limited.