I am working on a binary classification problem that is highly imbalanced, e.g., 90% class 0 and 10% class 1. I am experimenting with filtering the rows of the feature matrix to increase the percentage of class 1 observations (e.g., removing all observations where feature 1 < X and feature 10 > Y), which has shown promising results. I'm trying to find good methods of doing this. Any suggestions, links, or keywords for related techniques would be really appreciated!
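To make the filtering idea concrete, here is a minimal sketch of the kind of rule described above. It assumes the features are columns of a NumPy array; the function names, the toy data, the column indices, and the thresholds are all made up for illustration and are not my real values.

```python
import numpy as np

def drop_rows(X, y, col_a, x_thr, col_b, y_thr):
    """Remove observations where X[:, col_a] < x_thr AND X[:, col_b] > y_thr."""
    drop = (X[:, col_a] < x_thr) & (X[:, col_b] > y_thr)
    keep = ~drop
    return X[keep], y[keep], keep

def class1_share(y):
    """Fraction of observations labelled 1."""
    return float(np.mean(y == 1))

# Toy data (hypothetical): two feature columns, eight observations.
X = np.array([
    [0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2],
    [0.3, 0.7], [0.7, 0.9], [0.6, 0.3], [0.4, 0.6],
])
y = np.array([0, 0, 1, 1, 0, 1, 0, 1])

X_f, y_f, keep = drop_rows(X, y, col_a=0, x_thr=0.5, col_b=1, y_thr=0.5)

share_before = class1_share(y)    # class 1 share in the full data
share_after = class1_share(y_f)   # class 1 share after filtering
# Fraction of the original class 1 observations that survive the filter.
class1_retained = float(np.sum(y_f == 1) / np.sum(y == 1))

print(share_before, share_after, class1_retained)
```

Tracking `class1_retained` next to `share_after` shows how much of class 1 each candidate rule throws away in exchange for the higher share.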

Due to the nature of the problem, precision on class 1 is my priority (i.e., it is okay if only 0.5% of observations are identified as class 1, as long as most of those predictions are correct). Considering this, I don't mind losing a significant portion of class 1 observations to filtering, provided class 1 ends up as the dominant class in the filtered data.
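For reference, this is roughly how the trade-off above could be measured. It is a small sketch with hypothetical names and toy labels: precision is the share of class 1 predictions that are correct, and the flagged fraction is the share of all observations predicted as class 1.

```python
import numpy as np

def precision_and_flagged(y_true, y_pred):
    """Return (precision on class 1, fraction of observations flagged as class 1)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    flagged = y_pred == 1
    n_flagged = int(flagged.sum())
    if n_flagged == 0:
        # Precision is undefined when nothing is predicted as class 1.
        return float("nan"), 0.0
    precision = float(np.mean(y_true[flagged] == 1))
    return precision, n_flagged / len(y_true)

# Toy labels and predictions (hypothetical).
y_true = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

precision, flagged_fraction = precision_and_flagged(y_true, y_pred)
print(precision, flagged_fraction)
```

In this toy case four observations are flagged and three of them are truly class 1, so one real class 1 observation is missed; that kind of miss is acceptable for my purposes as long as precision stays high.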

Similar questions and discussions