Deep learning is said to be suitable for domains such as image classification and object detection, where the data are unstructured and the dataset must be large. Is oversampling a suitable way to make the dataset larger, and can it be used to enhance model performance?
Firstly, deep learning models are not reserved for unstructured or large datasets. Deep learning is used when there is a highly non-linear relationship between inputs and outputs, and when the system is so complex that we cannot determine all the inputs or causes generating the outputs.
For a small dataset it is hard to say in advance which technique to use; it depends on the data and on the underlying pattern and relationship between inputs and outputs. I can suggest some strategies for approaching the problem:
First, for a classification task, feature selection and extraction algorithms (e.g. PCA, NCA) can be used to assess the weights of the inputs, i.e. the impact of individual inputs on the outputs.
Since the data are small, classifiers such as KNN, SVM, Gaussian process regression, or Bayesian algorithms can be used to capture the pattern.
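A minimal sketch of this small-data baseline, using scikit-learn and a built-in toy dataset as a stand-in for the actual data (the pipeline and parameter choices are illustrative assumptions, not a prescription):

```python
# Sketch: PCA for feature extraction followed by simple classifiers
# (KNN, SVM) evaluated with cross-validation on a small dataset.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in for your own small dataset

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf", C=1.0))]:
    # Standardise, reduce to a few principal components, then classify.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=2), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```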
If possible, it is a good idea to model the process mathematically with a system of equations and to simulate that model numerically for different parameter values, observing how the simulated outputs vary. This is only possible if you know something about the architecture and dependencies of the system that generates the final outputs. Having done this, you will be able to justify or predict which model performs well in this case by measuring similarities between simulated and observed behaviour.
A deep learning model can be used with a smaller dataset too, but in that case the data should be very accurate and free of noise.
And lastly, testing the interpretability of the model against related theory, or with statistical techniques, is useful here.
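One statistical technique that can serve this purpose is permutation importance; the sketch below (scikit-learn, toy data, illustrative model choice) shows the general idea of checking how much each input actually drives the fitted model's predictions:

```python
# Sketch: permutation importance as a simple interpretability check.
# Dataset and model choice are illustrative.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
# Shuffle each feature in turn and measure how much test accuracy drops.
result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: importance = {imp:.3f}")
```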
Using a different distance metric can also improve the performance of such a model.
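For example, scikit-learn's KNN exposes a `metric` parameter, so different distance metrics can be compared directly by cross-validation (a sketch on a toy dataset):

```python
# Sketch: comparing distance metrics for a KNN classifier on a small dataset.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # stand-in small dataset

for metric in ["euclidean", "manhattan", "chebyshev"]:
    pipe = make_pipeline(StandardScaler(),
                         KNeighborsClassifier(n_neighbors=5, metric=metric))
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{metric:>10}: mean CV accuracy = {scores.mean():.3f}")
```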
There are two major ways of doing deep learning on a small dataset. The first one is to use a model pre-trained on a task with a large dataset, such as image classification on ImageNet. This is the well-known transfer learning. However, this is not guaranteed to improve the diversity of the "small" dataset or the generalisation capacity of the trained deep model. To further address the issue, the second common way is to use a proper data augmentation approach, which can avoid over-fitting and improve the generalisation capacity of the trained deep network.
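A minimal sketch of such a data augmentation pipeline, assuming a folder-per-class image dataset at data/train (a hypothetical path) and using torchvision transforms:

```python
# Sketch: data augmentation for a small image dataset with torchvision.
# The directory layout (data/train/<class_name>/*.jpg) is an assumption.
import torch
from torchvision import datasets, transforms

train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random crop/zoom
    transforms.RandomHorizontalFlip(),                      # mirror images
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # lighting changes
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],        # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

train_ds = datasets.ImageFolder("data/train", transform=train_tfms)
train_loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)
# Each epoch now sees a differently perturbed version of every image,
# which effectively enlarges the small dataset and reduces over-fitting.
```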
One solution is to use transfer learning to enhance performance: a model pre-trained on some large dataset is reused with its parameters kept fixed, except for the parameters of the final layer, which are retrained on the small dataset.
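A hedged sketch of that setup in PyTorch, assuming an ImageNet-pretrained ResNet-18 from torchvision; the number of target classes is an illustrative assumption:

```python
# Sketch: transfer learning by retraining only the final layer.
# Uses torchvision's pretrained ResNet-18 (torchvision >= 0.13 weights API).
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # number of classes in the small target dataset (assumption)

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze all pre-trained parameters.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer; only its parameters will be trained.
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# A standard training loop over the small (ideally augmented) dataset follows.
```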
Deep learning ends in what is essentially a logistic regression at the final layer, while the hidden layers automatically extract a good feature representation for that final layer from the data. Extracting a good feature representation needs more data.
If you have a small dataset, it is best to go with transfer learning or word2vec, which are trained on lots of data.
Otherwise, reduce the number of features of the data by using PCA or another dimensionality reduction technique.
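As a sketch of the pretrained-embedding route for text, the snippet below loads pretrained word2vec vectors via gensim and averages them into document features; the model name and the simple averaging scheme are illustrative assumptions:

```python
# Sketch: using pretrained word2vec vectors as features for a small text dataset.
import numpy as np
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")  # pretrained vectors (large download)

def doc_vector(text):
    """Average the word2vec vectors of the in-vocabulary words of a document."""
    words = [w for w in text.lower().split() if w in wv]
    if not words:
        return np.zeros(wv.vector_size)
    return np.mean([wv[w] for w in words], axis=0)

docs = ["the model overfits on small data", "transfer learning reuses features"]
features = np.vstack([doc_vector(d) for d in docs])  # shape (2, 300)
# These 300-dimensional features can feed a simple classifier (e.g. SVM),
# optionally after PCA if the dataset is very small.
```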
- Jittering: i.e. enlarging the training dataset. You can add artificial noise to your data in order to generate more "data". This is a known technique for regularizing the training set and avoiding overfitting; search for "stacked denoising autoencoder".
- Regularization: modify the objective function that has to be minimized. See L1/L2 regularization and Tikhonov regularization.
- Dropout: randomly dropping units during training prevents the network from becoming too dependent on particular neurons, which helps avoid overfitting (see the sketch after this list).
- Check your separation criteria for training and validation data.
- Parameter norm penalty (weight decay): penalize large parameter values.
- Early stopping: check the performance on held-out data at different numbers of iterations and stop training at the point where that performance starts to degrade.
- Reducing architecture complexity: use architecture designs that generalize well.
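A minimal PyTorch sketch combining three of these ideas (dropout, a weight-decay parameter norm penalty, and early stopping on a validation set); the data shapes, layer sizes, and patience are illustrative assumptions:

```python
# Sketch: dropout + weight decay + early stopping on a small dataset.
import torch
import torch.nn as nn

X_train, y_train = torch.randn(200, 20), torch.randint(0, 2, (200,))
X_val, y_val = torch.randn(50, 20), torch.randint(0, 2, (50,))

model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Dropout(p=0.5),           # dropout regularization
    nn.Linear(64, 2),
)
criterion = nn.CrossEntropyLoss()
# weight_decay adds an L2 parameter-norm penalty to the updates.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(500):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:   # early stopping
            break
```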
Deep learning techniques were proposed to handle big data, and better results are achieved when learning is done on large data. If you have a small dataset, use data augmentation to mitigate the small-data problem.