How can I measure inter-rater reliability for ordinal variables?

You can use Cohen's kappa statistic to measure inter-rater reliability. I am attaching a link to the Stata manual entry for kappa, which describes the method and provides examples. I think you will find it helpful.

Ariel

www.stata.com/manuals13/rkappa.pdf

Robert James McClelland

The kappa statistic is an index of agreement that compares the observed agreement between two raters on a nominal or ordinal scale with the agreement expected by chance alone. For ordinal data specifically, you can also use Spearman's rho. Extensions to more than two raters exist (see Siegel and Castellan below). Also for ordinal data, you can use weighted kappa, which is essentially the usual kappa except that the off-diagonal cells also contribute to the measure of agreement, so near-misses earn partial credit. Fleiss (a) provides guidelines for interpreting kappa values, but these are merely rules of thumb.
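To make the difference between plain and weighted kappa concrete, here is a minimal pure-Python sketch. The function name `weighted_kappa`, the `weights` argument, and the example ratings are made up for illustration. The linear and quadratic weights are the two common choices, assumed here rather than taken from the Stata manual.

```python
def weighted_kappa(rater1, rater2, categories, weights=None):
    """Cohen's kappa for two raters (illustrative sketch).

    categories: ordered list of the possible ratings (at least two).
    weights: None (unweighted), "linear", or "quadratic".
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Observed proportions in the k x k cross-classification table.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        # Agreement weight: 1 on the diagonal, partial credit off it.
        if weights is None:
            return 1.0 if i == j else 0.0
        d = abs(i - j) / (k - 1)
        return 1.0 - d if weights == "linear" else 1.0 - d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    p_obs = sum(w(i, j) * obs[i][j] for i, j in cells)
    p_exp = sum(w(i, j) * row[i] * col[j] for i, j in cells)
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical ratings on a 3-point ordinal scale.
r1 = [1, 2, 3, 1, 2, 3]
r2 = [1, 2, 3, 2, 3, 3]
print(weighted_kappa(r1, r2, [1, 2, 3]))            # unweighted
print(weighted_kappa(r1, r2, [1, 2, 3], "linear"))  # linear weights
```

In this made-up example both disagreements are off by only one category, so the linearly weighted kappa comes out higher than the unweighted one.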

The kappa statistic is asymptotically equivalent to the intraclass correlation coefficient (ICC) estimated from a two-way random effects ANOVA, but the significance tests and standard errors from the usual ANOVA framework are no longer valid with binary data. It is better to use the bootstrap to obtain a confidence interval (CI). Fleiss (b) discusses the connection between weighted kappa and the ICC.
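A percentile bootstrap CI for kappa can be sketched as follows. This is only one of several bootstrap variants, chosen here for simplicity. The function names, the number of resamples, the seed, and the example ratings are all assumptions made for illustration.

```python
import random
from collections import Counter


def cohen_kappa(pairs):
    """Unweighted Cohen's kappa from a list of (rater1, rater2) pairs.

    Returns None when chance agreement is exactly 1 (kappa undefined).
    """
    n = len(pairs)
    p_obs = sum(a == b for a, b in pairs) / n
    c1 = Counter(a for a, _ in pairs)
    c2 = Counter(b for _, b in pairs)
    p_exp = sum(c1[c] * c2[c] for c in c1) / n ** 2
    if p_exp == 1:
        return None
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_ci(pairs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample subjects with replacement."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        kappa = cohen_kappa(sample)
        if kappa is not None:  # skip degenerate resamples
            stats.append(kappa)
    stats.sort()
    m = len(stats)
    lo = stats[max(0, int(alpha / 2 * m))]
    hi = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return lo, hi


# Hypothetical binary ratings from two raters on 20 subjects.
pairs = [(1, 1)] * 8 + [(0, 0)] * 7 + [(1, 0)] * 3 + [(0, 1)] * 2
print(cohen_kappa(pairs), bootstrap_ci(pairs))
```

The resampling unit is the rated subject (the pair of ratings), not the individual rating, so the dependence between the two raters' scores on the same subject is preserved.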

S. Siegel and N. J. Castellan Jr. Nonparametric Statistics for the Behavioral Sciences. Second edition. New York: McGraw-Hill, 1988.

J. L. Fleiss (a). Statistical Methods for Rates and Proportions. Second edition. New York: Wiley, 1981.

J. L. Fleiss and J. Cohen (b). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33, 613-619, 1973.

s. Béatrice Marianne Ewalds-Kvist

My copy of Sidney Siegel's Nonparametric Statistics for the Behavioral Sciences (1956, international student edition) contains no kappa for ordinal scales. I received my PhD in 1985.

Robert James McClelland

A second edition was published in 1988; I think one of the authors had died by the time it appeared. Pages 284-91 begin to address this topic. However, some 80 years earlier, Spearman's rho had already covered agreement between coders:

Spearman’s rank correlation coefficient rho measures the agreement between two coders’ rankings of the same set of N objects. In its original form:

$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$

where $\sum D^2 = \sum_u (c_u - k_u)^2$ is the sum of the N squared differences between one coder’s rank $c_u$ and the other coder’s rank $k_u$ of the same object $u$. Whereas Krippendorff's alpha accounts for tied ranks in terms of their frequencies for all coders, rho averages them in each individual coder's instance. In the absence of ties, the numerator of the fraction in rho equals $N D_o$ and its denominator equals $\frac{n}{n-1} N D_e$, where $n = 2N$ and $D_o$ and $D_e$ are alpha's observed and expected disagreement. The denominator becomes $N D_e$ when sample sizes become large. So Spearman’s rho is the special case of alpha in which two coders rank a very large set of units.
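As a quick check of the original-form formula, rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), here is a minimal sketch that assumes there are no tied ranks. The function name and the example rankings are hypothetical.

```python
def spearman_rho(ranks_c, ranks_k):
    """Spearman's rho from two coders' rankings of the same N objects.

    Uses the original formula, which assumes no tied ranks.
    """
    n_objects = len(ranks_c)
    # Sum of squared rank differences across objects.
    sum_d2 = sum((c - k) ** 2 for c, k in zip(ranks_c, ranks_k))
    return 1 - 6 * sum_d2 / (n_objects * (n_objects ** 2 - 1))


# Hypothetical rankings of five objects by two coders.
print(spearman_rho([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 1.0, identical rankings
print(spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # -1.0, reversed rankings
```

With tied ranks this shortcut is no longer exact, and the usual approach is to assign average ranks and compute the Pearson correlation of the ranks instead.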
