Dear RG community,

I've coded N = 500 professional development courses for teachers for the presence of certain topics (0 = topic was not part of the course; 1 = topic was part of the course). I'd like to have the reliability of my coding checked by a second rater. What is the appropriate measure under these circumstances, and how many of the 500 courses would a second rater have to rate?

So far, I've come to the conclusion that Cohen's kappa may not be the preferred choice and that the Matthews correlation coefficient (MCC) might be more appropriate. Perhaps even simple percent agreement would be suitable in my case, since there are only two raters in total and the coding is binary? I've been unable to find anything on the minimum number of courses the second rater would need to code.
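For concreteness, here is a minimal sketch (in Python, using scikit-learn) of how I would compute the three candidate measures on a double-coded subsample for a single topic; the rater data and the subsample size of 100 below are made up purely for illustration:

```python
# Toy comparison of agreement measures for two raters and binary codes.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, matthews_corrcoef

rng = np.random.default_rng(42)

# Hypothetical subsample of n courses re-coded by a second rater
# (0 = topic not part of the course, 1 = topic part of the course).
n = 100
rater1 = rng.integers(0, 2, size=n)
# In this toy example the second rater agrees about 90% of the time.
rater2 = np.where(rng.random(n) < 0.9, rater1, 1 - rater1)

print("Percent agreement:", accuracy_score(rater1, rater2))
print("Cohen's kappa:    ", cohen_kappa_score(rater1, rater2))
print("MCC:              ", matthews_corrcoef(rater1, rater2))
```

My question is which of these (or which other measure) is the right one to report, and how large that subsample needs to be.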

Any help is greatly appreciated.

Best

Marcel
