Class imbalance is a major problem commonly encountered in machine learning practice. Some techniques, such as SMOTE or ADASYN, generate additional (synthetic) data for the minority class to reduce the imbalance. However, I am wondering about the quality and correctness of these synthetic data points and their corresponding labels.
Can we trust the labels assigned to the new data?
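
To make the concern concrete, here is a minimal sketch of how I understand SMOTE-style oversampling: a synthetic point is an interpolation between a minority sample and one of its nearest minority neighbours, and it inherits the minority label by construction. The function name `smote_like_sample`, the toy points, and the choice of `k` are my own illustrative assumptions, not the reference implementation of SMOTE or ADASYN.

```python
import math
import random

def smote_like_sample(minority, k=2, rng=None):
    """Hypothetical helper: build one synthetic point by interpolating
    between a random minority point and one of its k nearest minority
    neighbours (a simplified reading of SMOTE, not the official code)."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Candidate neighbours: all other minority points, closest first.
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))

# Toy data (made up): the majority class sits between two minority clusters.
minority = [(1.0, 1.0), (1.5, 1.2), (5.0, 5.0)]
majority = [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]

rng = random.Random(42)
synthetic = [smote_like_sample(minority, k=2, rng=rng) for _ in range(20)]

# The label is never checked against the data: every synthetic point is
# simply declared to belong to the minority class.
synthetic_labels = [1] * len(synthetic)

# Distance from each synthetic point to its closest majority point; a small
# value means a "minority" label was placed inside majority territory.
closest_majority = [min(math.dist(s, m) for m in majority) for s in synthetic]
```

In this toy layout, a segment between two distant minority points can pass straight through the majority region, so a synthetic point may end up labelled as minority while sitting next to majority samples. That is exactly the situation that makes me doubt the labels.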