Just as there are data-augmentation techniques for image classification and text data, are there analogous techniques for numeric datasets that can be used to expand a small dataset?
Augmenting images is easier because the relationship between the pixels and the assigned label is preserved under the usual transformations. In contrast, perturbing a dataset with categorical and numeric features can push a sample into an entirely different class. However, unsupervised machine learning algorithms (for example, clustering) can be used to split the data into subsets, and the features within each subset can then be randomly perturbed, using that subset's mean and standard deviation as perturbation bounds.
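A minimal sketch of the per-subset perturbation idea, under a few assumptions of mine: the subset labels (`groups`) are taken as given and could come from any clustering algorithm, only numeric features are handled, and "bounds" is read as clipping each perturbed value to the subset's mean plus or minus `k` standard deviations. The function name `augment_by_group` and the parameters `noise_scale` and `k` are hypothetical, chosen for illustration only.

```python
import numpy as np

def augment_by_group(X, groups, n_new_per_group, noise_scale=0.1, k=2.0, seed=0):
    """Create perturbed copies of rows in X, one subset (group) at a time."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    new_rows, new_groups = [], []
    for g in np.unique(groups):
        members = X[groups == g]
        mu = members.mean(axis=0)      # per-feature mean of this subset
        sigma = members.std(axis=0)    # per-feature standard deviation
        # Pick existing rows (with replacement) as starting points.
        base = members[rng.integers(0, len(members), size=n_new_per_group)]
        # Gaussian noise scaled by a fraction of each feature's std.
        noise = rng.normal(0.0, 1.0, size=base.shape) * noise_scale * sigma
        # Assumption: keep perturbed values within mean +/- k * std.
        perturbed = np.clip(base + noise, mu - k * sigma, mu + k * sigma)
        new_rows.append(perturbed)
        new_groups.append(np.full(n_new_per_group, g))
    return np.vstack(new_rows), np.concatenate(new_groups)

# Toy data: two well-separated subsets with two numeric features each.
X = np.array([[1.0, 10.0], [1.2, 11.0], [0.8, 9.5],
              [5.0, 50.0], [5.5, 52.0], [4.8, 49.0]])
groups = np.array([0, 0, 0, 1, 1, 1])
X_new, g_new = augment_by_group(X, groups, n_new_per_group=4)
```

Keeping `noise_scale` small relative to the spread of each subset is what limits the risk mentioned above of pushing a sample into a different class. It does not remove that risk, so the augmented rows are still worth checking against held-out data.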
Yes, data-augmentation techniques are useful when dealing with imbalanced data. Generative Adversarial Networks (GANs) can generate realistic synthetic samples, which can help when training the model.