I am using SMOTE in a deep learning project for minority-class oversampling, but it consumes a huge amount of memory and crashes the session in different IDEs. Can anyone suggest a solution to this issue?
Any machine learning practitioner working on binary classification will eventually run into an imbalanced dataset. It is a typical scenario in many real business problems such as fraud detection, spam filtering, rare disease discovery, and hardware fault detection. Class imbalance arises when the classes are unequally distributed, i.e. the number of data points in the negative (majority) class is very large compared to that of the positive (minority) class.
One solution is to apply SMOTE in mini-batches, generating synthetic samples in small chunks instead of materializing them all at once. Peak memory then depends on the batch size rather than the total number of synthetic samples, which lets SMOTE scale to much larger datasets.
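Here is a minimal sketch of the idea, using only NumPy and scikit-learn. The function name `batched_smote` and its parameters are my own; this is not an official API, just an illustration of SMOTE-style interpolation done batch by batch:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def batched_smote(X_min, n_samples, k=5, batch_size=256, seed=0):
    """Generate SMOTE-style synthetic minority samples in small batches
    so peak memory is bounded by batch_size, not n_samples."""
    rng = np.random.default_rng(seed)
    # fit neighbours once; k+1 because each point is its own nearest neighbour
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    out = []
    remaining = n_samples
    while remaining > 0:
        b = min(batch_size, remaining)
        idx = rng.integers(0, len(X_min), size=b)      # random base points
        _, neigh = nn.kneighbors(X_min[idx])           # shape (b, k+1)
        pick = rng.integers(1, k + 1, size=b)          # skip column 0 (self)
        neighbors = X_min[neigh[np.arange(b), pick]]
        gap = rng.random((b, 1))                       # interpolation factor
        out.append(X_min[idx] + gap * (neighbors - X_min[idx]))
        remaining -= b
    return np.vstack(out)

# toy minority class: 100 points, 4 features
X_min = np.random.rand(100, 4)
X_syn = batched_smote(X_min, n_samples=1000)
print(X_syn.shape)  # (1000, 4)
```

Each synthetic point is a convex combination of a minority point and one of its k nearest minority neighbours, which is the core of SMOTE; only a batch of them exists in memory at any time before being stacked.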
Another solution is to sidestep oversampling entirely and use incremental (online/streaming) learning, which trains on mini-batches of data and never needs the full dataset in memory at once; class imbalance can then be handled with class or sample weights instead of synthetic samples.
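A minimal sketch of this approach with scikit-learn's `SGDClassifier` and `partial_fit`, weighting each sample inversely to its class frequency instead of oversampling (the toy data and weighting scheme are my own illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
# imbalanced toy "stream": ~5% positives
X = rng.normal(size=(10_000, 20))
y = (rng.random(10_000) < 0.05).astype(int)
X[y == 1] += 1.0  # shift positives so they are learnable

clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])
# weight samples inversely to class frequency (a simple alternative to SMOTE)
w = {0: 1.0, 1: (y == 0).sum() / max((y == 1).sum(), 1)}

for start in range(0, len(X), 1000):  # consume the data in mini-batches
    Xb, yb = X[start:start + 1000], y[start:start + 1000]
    clf.partial_fit(Xb, yb, classes=classes,
                    sample_weight=np.array([w[c] for c in yb], dtype=float))

print(clf.score(X, y))
```

Only one mini-batch is held in memory per step, so this scales to datasets far larger than RAM (e.g. when read lazily from disk).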
In addition, some SMOTE implementations, such as the one in Python's Imbalanced-Learn (imblearn) library, accept scipy sparse matrices as input. For high-dimensional data that is mostly zeros (e.g. one-hot or text features), keeping the data in a sparse representation throughout resampling can cut memory use dramatically compared to a dense array.
Here is an article for your reference:
(PDF) Heuristic Approach of Over-Sampling and Under-Sampling in Fraud Detection (researchgate.net)