I am conducting a quantitative content analysis of party manifestos (and other sources). I don’t have a team of coders, so I have to code everything myself. However, I found a student who coded about 10% of my text material using the same category I use.
My coding unit is the single sentence. I test inter-coder reliability on about 2,500 sentences, but only about 1% of them fit into my category; the rest are coded as 0 (does not fit). As a result, percent agreement is almost 100%, because only two of the 2,500 sentences are coded differently by the student and me.
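Concretely: (2500 − 2) / 2500 = 0.9992, i.e. about 99.9% raw agreement.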
Since nearly all coded sentences fall into the same category (0), Krippendorff’s alpha and Cohen’s kappa provide no meaningful results. So what can I do? I tried to calculate inter-coder reliability by comparing only the sentences that fit into the category (1), but the problem there is obviously the same. I now tend to report only the percent agreement for the sentences coded into the category (1), but since it is often argued that percent agreement alone is not “enough”, I am at a loss.
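For illustration, here is roughly how I compute the two statistics in Python (scikit-learn). The exact split of the two disagreements and the 24 jointly coded sentences below are made up; only the totals match what I describe above:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

n = 2500           # sentences in the reliability sample
agreed_pos = 24    # assumed: sentences we both coded as 1 (about 1% of n)

# Assumed: the two disagreements fall one in each direction
me      = np.array([1] * agreed_pos + [1, 0] + [0] * (n - agreed_pos - 2))
student = np.array([1] * agreed_pos + [0, 1] + [0] * (n - agreed_pos - 2))

percent_agreement = np.mean(me == student)   # (2500 - 2) / 2500 = 0.9992
kappa = cohen_kappa_score(me, student)       # chance-corrected agreement

print(f"percent agreement: {percent_agreement:.4f}")
print(f"Cohen's kappa:     {kappa:.4f}")
```

Whatever the exact split, with a category this rare the kappa value hinges almost entirely on how the handful of sentences coded 1 line up between us, which is why it feels so unstable to me.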