I have a classification problem with 27 classes and a strong imbalance between the classes. The distribution of the instances over the classes is shown below.
I am evaluating different models to find both the best sampling method for handling the imbalance and the best algorithm. I am currently running 5-fold cross-validation (n_split = 5) and looking at accuracy and macro recall.
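To make the setup concrete, here is a minimal sketch of this kind of evaluation loop. It is not my actual code: the toy labels, the helper names `stratified_folds` and `oversample`, and the choice of random oversampling are all assumptions for illustration, written in plain Python with no library dependencies. It shows stratified folds with resampling applied to the training part of each fold only, so the held-out part keeps the original class distribution.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign sample indices to folds so each class is spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


def oversample(indices, labels, seed=0):
    """Randomly duplicate minority-class indices until every class matches the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy labels: 3 imbalanced classes standing in for the 27 real ones.
labels = ["a"] * 50 + ["b"] * 10 + ["c"] * 5
folds = stratified_folds(labels, n_splits=5)

for k, test_idx in enumerate(folds):
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    # Resample the training indices only; the test fold is left untouched.
    train_balanced = oversample(train_idx, labels, seed=k)
    # A model would be fit on train_balanced and scored on test_idx here.
    print(k, Counter(labels[i] for i in train_balanced),
          Counter(labels[i] for i in test_idx))
```

Resampling before splitting would let duplicated minority samples appear in both the training and test parts of a fold, which tends to inflate the scores.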
Note that all classes are equally important to me. The algorithms tested so far are decision trees, random forests, and boosted trees (gradient boosting, AdaBoost, ...).
Is this a proper testing method? Any recommendations?
Which metric is most relevant in this case (imbalanced and multiclass): accuracy, G-mean, F1, macro recall, ...?
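To illustrate how these metrics can disagree, here is a small sketch in plain Python. The helper name `summarize` and the toy data are assumptions for illustration, and G-mean is taken here to be the geometric mean of the per-class recalls, which is one common multiclass definition.

```python
import math
from collections import Counter


def summarize(y_true, y_pred):
    """Compute accuracy, macro recall, G-mean and macro F1 from label lists."""
    classes = sorted(set(y_true))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class

    recall = {c: true_pos[c] / support[c] for c in classes}
    precision = {
        c: true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    }
    f1 = {
        c: (2 * precision[c] * recall[c] / (precision[c] + recall[c])
            if precision[c] + recall[c] else 0.0)
        for c in classes
    }

    n_classes = len(classes)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "macro_recall": sum(recall.values()) / n_classes,
        # Geometric mean of per-class recalls: zero if any class is never recovered.
        "g_mean": math.prod(recall.values()) ** (1 / n_classes),
        "macro_f1": sum(f1.values()) / n_classes,
    }


# Hypothetical toy case: a classifier that always predicts the majority class.
y_true = ["a"] * 90 + ["b"] * 10
y_pred = ["a"] * 100
scores = summarize(y_true, y_pred)
print(scores)
```

On this toy case accuracy is 0.9 even though class "b" is never predicted, while macro recall is 0.5 and G-mean is 0. The macro-averaged metrics weight every class equally, which matches the requirement that all classes matter the same.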