I trained a VGG-16 model for a binary classification task on equal numbers of abnormal and normal images (n = 2000). The literature recommends model calibration when a model is trained on an imbalanced dataset, to rescale the predicted probabilities so that they reflect the true likelihood of each class. However, I experimented to see whether calibration also affects a model trained on a balanced dataset. Although trained on balanced data, the model under-predicted the positive class, as seen in the figure below: the uncalibrated outputs lie above the y = x diagonal of the reliability diagram. Applying various calibration methods reduced the expected calibration error (ECE) relative to the uncalibrated output (uncalibrated ECE: 0.039 vs. Platt-scaled ECE: 0.02), and the calibrated outputs closely follow the y = x diagonal. After calibration, the precision, Cohen's kappa, F-score, and MCC metrics all increased compared with those obtained from the uncalibrated outputs.
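For reference, this is the workflow I followed, shown here as a minimal sketch on synthetic data rather than my actual model outputs: an ECE estimate (equal-width binning) and Platt scaling implemented as a logistic regression on the model's logit scores. The data-generating function and split are illustrative assumptions, not my experimental setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: bin-weighted average gap between mean confidence and observed frequency."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_prob > lo) & (y_prob <= hi)
        if mask.any():
            gap = abs(y_prob[mask].mean() - y_true[mask].mean())
            ece += mask.mean() * gap
    return ece

rng = np.random.default_rng(0)
n = 5000

# Simulate an under-confident model (points above the y = x diagonal):
# the true probabilities are more extreme than the predicted ones.
p_pred = rng.uniform(0.05, 0.95, n)
logit_p = np.log(p_pred / (1 - p_pred))
p_true = 1.0 / (1.0 + np.exp(-2.0 * logit_p))
y = rng.binomial(1, p_true)

# Platt scaling: fit sigmoid(a * logit + b) on a held-out calibration split.
half = n // 2
platt = LogisticRegression().fit(logit_p[:half, None], y[:half])
p_cal = platt.predict_proba(logit_p[half:, None])[:, 1]

ece_before = expected_calibration_error(y[half:], p_pred[half:])
ece_after = expected_calibration_error(y[half:], p_cal)
print(f"ECE before: {ece_before:.3f}  ECE after: {ece_after:.3f}")
```

On this synthetic data the Platt scaler can recover the true mapping almost exactly, so the ECE drops sharply; with real model outputs the improvement is smaller, as in my 0.039 → 0.02 result.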

I have the following questions:

  • Since the classes are balanced, the model should not be prone to biased learning. How can calibration error still arise in a model trained on a balanced dataset?
  • Is the calibration error due to limitations in the model's capacity to learn the positive and negative samples?
  • Is it always necessary to calibrate a model trained on balanced data?