Dataset imbalance is a major problem commonly encountered in machine learning practice. Techniques such as SMOTE or ADASYN generate additional (synthetic) data to mitigate the imbalance. However, I am wondering about the quality and correctness of these synthetic data and their corresponding labels.

Should we place our trust in the new data labels?
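
To make the question concrete, here is a minimal sketch of how a SMOTE-style method produces a synthetic point and its label. It is a simplified illustration, not the reference implementation of SMOTE or ADASYN: the function name `smote_like_samples`, the toy data, and the choice of `k` are all hypothetical. The point it tries to show is that the label is never predicted or checked; each new point is assumed to belong to the minority class simply because it lies on a line segment between two minority points.

```python
import math
import random

def smote_like_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (SMOTE-style sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical toy minority class; (5.0, 5.0) is an outlier far from the rest.
minority = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (5.0, 5.0)]
new_points = smote_like_samples(minority, n_new=5)

# The label is assigned by construction, not verified against the majority class.
new_labels = ["minority"] * len(new_points)
```

Note that when the outlier at `(5.0, 5.0)` is drawn as the base point, the synthetic point falls somewhere on the long segment toward the cluster. If majority samples occupy that region, the inherited "minority" label may be wrong, which is the situation I am asking about.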
