I am working with a GAN, which uses deep learning (neural networks) to create synthetic data. Does anyone have any ideas or material covering the use of classical machine learning algorithms instead of DL? Or is this even possible in the first place?
GANs work by training a generator network, which produces synthetic data, alongside a discriminator network that is run on the generated data to judge how realistic it is. The gradient of the discriminator's output with respect to the synthetic data indicates how to slightly alter the synthetic data to make it more realistic, and that signal is used to update the generator.
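To make the gradient idea concrete, here is a minimal sketch that assumes a logistic-regression discriminator, D(x) = sigmoid(w·x + b), with hand-picked (untrained) parameters. The names `discriminator` and `grad_wrt_input`, the values of `w`, `b` and `step`, and the random stand-in for generator output are all illustrative assumptions. In a real GAN this gradient is passed on to the generator's parameters rather than applied to the samples directly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator(x, w, b):
    """Probability that each row of x is real, under a logistic model."""
    return sigmoid(x @ w + b)

def grad_wrt_input(x, w, b):
    """Gradient of D(x) with respect to x, which is D(x) * (1 - D(x)) * w."""
    d = discriminator(x, w, b)
    return (d * (1.0 - d))[:, None] * w[None, :]

# Hypothetical, hand-picked discriminator parameters (not trained here).
w = np.array([1.5, -0.5])
b = -1.0

rng = np.random.default_rng(0)
synthetic = rng.normal(size=(5, 2))  # stand-in for generator output
before = discriminator(synthetic, w, b)

# One small gradient-ascent step applied to the samples themselves.
step = 0.5
nudged = synthetic + step * grad_wrt_input(synthetic, w, b)
after = discriminator(nudged, w, b)
```

After the step, every entry of `after` is larger than the matching entry of `before`, i.e. the discriminator rates the nudged samples as more realistic. This only works because the discriminator is differentiable with respect to its input, which is worth keeping in mind when considering classical models for that role.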
Actually, a generative adversarial network (GAN) is a machine learning (ML) model in which two neural networks compete with each other to improve the accuracy of their predictions.
Ferdib Al Islam, good question. I mean numerical/tabular data, which, as you said, SMOTE can reproduce. I have also used a GAN to produce such data, but I am looking to see whether I can use, or whether anyone has used, a classical ML algorithm for the discriminator.
This depends heavily on the type and quality of data that you want to obtain. If your data is normally distributed, you might want to fit a Principal Component Analysis (PCA) projection. Then you can generate random vectors in the PCA subspace and project them back into the original space to generate samples.
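A rough sketch of that PCA idea, using only NumPy. The helper names `fit_pca` and `sample_pca`, the toy 3-D Gaussian data, and the choice of two components are assumptions made for illustration. The coefficients are drawn from a zero-mean Gaussian whose variance per component matches the variance PCA found along that axis.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean, principal axes (as rows) and per-axis variances."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: the rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = (s ** 2) / (X.shape[0] - 1)
    return mean, Vt[:n_components], variances[:n_components]

def sample_pca(mean, components, variances, n_samples, rng):
    """Draw Gaussian coefficients in the PCA subspace and project them back."""
    z = rng.normal(size=(n_samples, len(variances))) * np.sqrt(variances)
    return z @ components + mean

rng = np.random.default_rng(0)

# Toy "real" data: a correlated 3-D Gaussian (assumed for illustration).
cov = np.array([[2.0, 0.8, 0.0],
                [0.8, 1.0, 0.3],
                [0.0, 0.3, 0.5]])
X_real = rng.multivariate_normal(mean=[1.0, -2.0, 0.5], cov=cov, size=500)

mean, components, variances = fit_pca(X_real, n_components=2)
X_synth = sample_pca(mean, components, variances, n_samples=1000, rng=rng)
```

`X_synth` has the same number of columns as `X_real`, but it lies entirely in the 2-D subspace spanned by the retained components, so any variance along the dropped axis is lost.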
The advantage of this method is that it does not require tons of training data to estimate the PCA subspace. The disadvantage is that it requires the data to be normally distributed. If that is not the case, you can move to more complex models such as Gaussian Mixture Models (GMMs) to generate data.
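For data that is not normally distributed, the GMM route could look roughly like this with scikit-learn's `GaussianMixture`. The two-cluster toy data and the choice of `n_components=2` are assumptions for illustration; on real data the number of components would have to be chosen, for example by comparing fits.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy bimodal "real" data, which a single Gaussian would fit poorly.
cluster_a = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(300, 2))
cluster_b = rng.normal(loc=[3.0, 1.0], scale=0.8, size=(300, 2))
X_real = np.vstack([cluster_a, cluster_b])

# Fit a two-component mixture with full covariance matrices.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X_real)

# sample() returns the synthetic points and the component each came from.
X_synth, component_ids = gmm.sample(n_samples=1000)
```

One common way to pick the number of components is to fit several candidates and compare `gmm.bic(X_real)`, preferring the lowest value.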
Yes, absolutely. The paper titled 'Using Support Vector Machines for Generating Synthetic Datasets' contains an initial investigation into whether support vector machines can be used to generate synthetic datasets. The application is limited to categorical data, but extensions to continuous data should be straightforward.
But the accuracy of deep learning seems to be better than that of simple parametric or non-parametric models.
Manuel Günther, that sounds interesting. I will look into this. Javad Dogani, thank you for sharing the paper. I would also think that DL should perform better.