19 September 2023

I am currently working on a prediction project in which I apply machine learning classification techniques.

I have already computed various classification metrics, such as accuracy, precision, recall, AUC-ROC, and F1 score. What I am struggling with is how to interpret these values objectively in terms of their quality. In frequentist statistics, for instance, there are established conventions for interpreting effect sizes (e.g., Cohen's thresholds for small, medium, and large effects).
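For reference, here is a minimal, self-contained sketch of how such metrics are typically computed with scikit-learn; the toy dataset and the `clf` model below are purely illustrative stand-ins, not my actual setup:

```python
# Illustrative sketch: computing the metrics mentioned above with scikit-learn
# on a toy dataset (placeholder data and model, not my actual project).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)               # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

print(f"Accuracy : {accuracy_score(y_test, y_pred):.3f}")
print(f"Precision: {precision_score(y_test, y_pred):.3f}")
print(f"Recall   : {recall_score(y_test, y_pred):.3f}")
print(f"F1 score : {f1_score(y_test, y_pred):.3f}")
print(f"AUC-ROC  : {roc_auc_score(y_test, y_score):.3f}")  # needs scores, not labels
```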

Is there a similar set of guidelines or conventions, with a citable source, for interpreting classification metrics? I am particularly interested in categories such as "poor", "sufficient", "satisfactory", "good", and "excellent", or something along those lines.
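To make concrete the kind of convention I have in mind, here is a hypothetical sketch; the thresholds below are invented purely for illustration, and a citable source for thresholds like these is exactly what I am missing:

```python
# Hypothetical sketch of the kind of interpretation scheme I am looking for.
# NOTE: the thresholds below are invented for illustration only; they do NOT
# come from any citable source -- such a source is precisely what I am asking for.
def interpret_metric(value: float) -> str:
    """Map a metric in [0, 1] to a (hypothetical) quality label."""
    if value < 0.6:
        return "poor"
    elif value < 0.7:
        return "sufficient"
    elif value < 0.8:
        return "satisfactory"
    elif value < 0.9:
        return "good"
    return "excellent"

print(interpret_metric(0.85))  # -> good
```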

I understand that the context and the specific task are crucial for any interpretation, but I still need a citable source that provides general guidelines, especially because this work is in an educational context.

Thank you in advance!
