Dear fellow researchers,
I am looking for references that explain how to interpret a Cohen's kappa statistic that differs between the training and validation datasets. Some of the machine learning models had nearly identical kappa values on the two datasets, whereas for others the values deviated considerably.
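For concreteness, the comparison in question can be sketched as follows. This is only an illustrative sketch: the labels are made-up toy data rather than results from any real model, and the names `cohen_kappa`, `kappa_train`, and `kappa_val` are hypothetical. It computes kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance, and then reports the gap between the two datasets.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa between two label sequences: (p_o - p_e) / (1 - p_e)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of items where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_o - p_e) / (1 - p_e)


# Toy labels (assumed for illustration only).
train_true = [0, 0, 1, 1, 0, 1, 0, 1]
train_pred = [0, 0, 1, 1, 0, 1, 0, 0]
val_true = [0, 1, 0, 1, 0, 1, 0, 1]
val_pred = [0, 0, 0, 1, 1, 1, 0, 0]

kappa_train = cohen_kappa(train_true, train_pred)
kappa_val = cohen_kappa(val_true, val_pred)
print(f"train kappa = {kappa_train:.3f}")
print(f"validation kappa = {kappa_val:.3f}")
print(f"gap = {kappa_train - kappa_val:.3f}")
```

A large positive gap of this kind is the situation the question is about; how to interpret it is exactly what the requested references would need to address.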
I would appreciate it if you could share any published work on this issue.
Thank you and best regards,
Sushant