In medical machine learning problems, should I use the weighted F1 score, or stick with the standard way of calculating F1? Does this change if there is class imbalance?
Yes, averaged F1 scores are frequently used when a classification problem has class imbalance. When one class greatly outnumbers the others, conventional metrics like accuracy may not represent the model's performance accurately. Note, however, that the weighted F1 score weights each class's F1 by its support (the number of true instances of that class), so it gives more weight to the majority class, not the minority class. If performance on the minority class is what you care about, the macro-averaged F1 or the F1 of the minority class itself is more informative: both account for precision and recall without letting the majority class dominate. To get a more thorough assessment of your model, it is advisable to report metrics that address class imbalance.
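Strictly speaking, the F1 score is defined for binary classification; with more than two classes it has to be averaged over per-class scores. If your classification problem is binary, you can use the standard F1 score computed on the positive class (usually the minority class). It ignores true negatives, so it is less distorted by class imbalance than accuracy, though it still depends on which class you treat as positive.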
For multi-class classification tasks, you should always think twice about whether F1 (micro-averaged, macro-averaged or weighted) is the right evaluation metric. In many cases it is not. For imbalanced multi-class tasks, balanced accuracy (https://neptune.ai/blog/balanced-accuracy), i.e. the average of the per-class recalls, is often the better choice for evaluation.
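To make the difference concrete, here is a minimal sketch in plain Python of the metric described at the link above, computed as the unweighted mean of per-class recall. The function names and the toy labels are my own illustration, not taken from any library or from a real medical dataset.

```python
def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    recalls = {}
    for label in sorted(set(y_true)):
        # Indices of the samples whose true class is `label`
        idx = [i for i, t in enumerate(y_true) if t == label]
        correct = sum(1 for i in idx if y_pred[i] == label)
        recalls[label] = correct / len(idx)
    return recalls


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy data: 8 negatives (0) and 2 positives (1);
# the model misses one of the two positives.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Plain accuracy looks good (0.9) because the majority class is easy, while the balanced version drops to 0.75 because half of the minority class is missed. If you use scikit-learn, `sklearn.metrics.balanced_accuracy_score` should give the same number on this data.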
Yes, the weighted F1 score is often used in the context of imbalanced datasets to address the issue of unequal class distribution. In a binary or multi-class classification problem, class imbalance occurs when the number of instances belonging to different classes is significantly uneven. The weighted F1 score considers the class frequencies and assigns different weights to different classes based on their prevalence in the dataset.
The weighted F1 score is calculated by taking the average of the F1 scores for each class, where the average is weighted by the number of true instances (the support) of each class. This means every class contributes in proportion to its frequency, so the majority class still carries the most weight. If you want each class to count equally regardless of its size, use the macro-averaged F1 instead.
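The calculation described above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the helper names and the toy labels are made up, and the macro average is included just for comparison.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} treating each label in turn as the positive class."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 here when the denominator is 0
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Plain mean of the per-class F1 scores (every class counts equally)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by each class's support."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)  # number of true instances per class
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Hypothetical toy data: 8 negatives (0) and 2 positives (1);
# the model misses one of the two positives.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]

print({c: round(f, 3) for c, f in per_class_f1(y_true, y_pred).items()})
# {0: 0.941, 1: 0.667}
print(round(macro_f1(y_true, y_pred), 3))     # 0.804
print(round(weighted_f1(y_true, y_pred), 3))  # 0.886
```

On this toy example the weighted score (0.886) sits closer to the majority-class F1 than the macro score (0.804) does, which is what weighting by support implies. scikit-learn's `f1_score` exposes the same two averages through `average="weighted"` and `average="macro"`.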