I have read that Fleiss' kappa measures multi-rater agreement for any constant number of raters (≥3), even when different subjects are rated by different subsets of raters. However, in our study the same 3 raters classify all the subjects. There are published studies with the same design as ours that use Fleiss' kappa, but I'm not sure whether that is correct or whether it would be better to use Light's kappa (Cohen's kappa computed for each coder pair and then averaged), as in the sketch below. I'd appreciate any advice, thank you.
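For reference, here is a minimal sketch (assuming Python with statsmodels and scikit-learn, and a made-up `ratings` array with one column per rater) of how both statistics can be computed on the same fixed-rater data, so the two values can be compared directly:

```python
import numpy as np
from itertools import combinations
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical example data: one row per subject, one column per rater
# (3 raters), values are the assigned category labels (0, 1, 2).
ratings = np.array([
    [0, 0, 1],
    [1, 1, 1],
    [2, 2, 2],
    [0, 1, 1],
    [2, 2, 1],
    [0, 0, 0],
])

# Fleiss' kappa: first convert to a subjects x categories count table
counts, _ = aggregate_raters(ratings)
print("Fleiss' kappa:", fleiss_kappa(counts, method="fleiss"))

# Light's kappa: Cohen's kappa for each rater pair, then the mean
pair_kappas = [cohen_kappa_score(ratings[:, i], ratings[:, j])
               for i, j in combinations(range(ratings.shape[1]), 2)]
print("Light's kappa:", np.mean(pair_kappas))
```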
