I have a supervised learning problem with two labeled classes, and the classes are imbalanced. The dataset is sparse, high-dimensional, and binary, with ~30,000 samples and 880 variables. Thank you in advance.
You could try a method we proposed: a multiobjective genetic programming-based ensemble that performs feature selection and classification simultaneously. The paper is available here: https://www.researchgate.net/publication/273463169_A_Multiobjective_Genetic_Programming-Based_Ensemble_for_Simultaneous_Feature_Selection_and_Classification
If you would like the code, email me at [email protected].
For feature selection, I recommend the method proposed in "Infinite Feature Selection". It is one of the more recent state-of-the-art techniques for feature selection.
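Whichever method you try, it may help to have a simple baseline to compare against. The sketch below is not Infinite Feature Selection and not the genetic programming ensemble mentioned above; it is a plain chi-square filter that scores each binary feature against the binary label from its 2x2 contingency table. The names `chi2_scores`, `top_k`, and the "list of sets of active feature indices" layout are my own assumptions, chosen because they suit sparse binary data.

```python
from collections import Counter


def chi2_scores(rows, labels, n_features):
    """Chi-square statistic of each binary feature against a binary label.

    rows:   list of sets; each set holds the indices of features equal to 1
            (hypothetical sparse layout, not taken from the question).
    labels: list of 0/1 class labels, same length as rows.
    """
    n = len(rows)
    n_pos = sum(labels)
    n_neg = n - n_pos

    on_pos = Counter()  # feature index -> number of class-1 samples where it is 1
    on_neg = Counter()  # feature index -> number of class-0 samples where it is 1
    for active, y in zip(rows, labels):
        (on_pos if y == 1 else on_neg).update(active)

    scores = []
    for j in range(n_features):
        a = on_pos[j]    # feature = 1, class = 1
        b = on_neg[j]    # feature = 1, class = 0
        c = n_pos - a    # feature = 0, class = 1
        d = n_neg - b    # feature = 0, class = 0
        denom = (a + b) * (c + d) * n_pos * n_neg
        # A constant feature (or a missing class) gives a zero denominator;
        # such a feature carries no information, so score it 0.
        scores.append(n * (a * d - b * c) ** 2 / denom if denom else 0.0)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Tiny made-up example: feature 0 separates the classes, 1 and 2 do not.
    rows = [{0, 2}, {0}, {0, 1}, {1}, {2}, set()]
    labels = [1, 1, 1, 0, 0, 0]
    scores = chi2_scores(rows, labels, n_features=3)
    print(top_k(scores, k=1))
```

A filter like this scores each feature on its own, so it will not detect redundancy between features. With imbalanced classes, judge the selected subset with a metric such as balanced accuracy or F1 rather than plain accuracy.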