In qualitative thematic analysis, two raters are invited to code the qualitative data into various themes. How can I calculate the inter-rater reliability, and should the calculation be built into the coding process?
Inter-rater reliability (IRR) is easy to calculate for qualitative research, but you must outline your underlying assumptions for doing it. You should give a little more detail about the type of qualitative methodology you are following and why you are using IRR. In some disciplines and schools of thought, IRR is considered unnecessary, as the researchers themselves bring varied but valid perspectives to identifying unique codes and themes in the data.
Here are some questions you should ask yourself:
1) Am I looking for generalizability of the findings beyond the sample to an entire population? (often not recommended, but if so desired it requires large sample sizes)
2) Am I looking to generalize beyond the raters to make a statement about how they interact with this specific subject? (for identifying latent traits contributing to some phenomenon or another)
3) Am I doing this because a journal/supervisor is asking for it, regardless of whether it is helpful?
Once you have figured out the answers to these questions, you can start choosing which of the approaches currently in use best answers your research questions.
Here is a link that outlines the general idea behind IRR and provides information, along with additional resources, on calculating the most notable measures: percent agreement, Holsti's method, Scott's pi (π), Cohen's kappa (κ), and Krippendorff's alpha (α).
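If you want to compute two of these yourself for two raters who each assign one code per data segment, here is a minimal Python sketch of percent agreement and Cohen's kappa. The code labels and data are invented for illustration, and packaged implementations (e.g. scikit-learn's cohen_kappa_score) are also available if you prefer not to do it by hand:

```python
from collections import Counter

def percent_agreement(rater1, rater2):
    """Proportion of segments on which the two raters assigned the same code."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater1)
    p_o = percent_agreement(rater1, rater2)   # observed agreement
    freq1 = Counter(rater1)                   # marginal code frequencies, rater 1
    freq2 = Counter(rater2)                   # marginal code frequencies, rater 2
    codes = set(freq1) | set(freq2)
    # chance agreement: probability both raters independently pick the same code
    p_e = sum((freq1[c] / n) * (freq2[c] / n) for c in codes)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two raters to ten transcript segments
r1 = ["trust", "conflict", "trust", "support", "conflict",
      "trust", "support", "trust", "conflict", "support"]
r2 = ["trust", "conflict", "support", "support", "conflict",
      "trust", "trust", "trust", "conflict", "support"]

print(f"Percent agreement: {percent_agreement(r1, r2):.2f}")  # 0.80
print(f"Cohen's kappa:     {cohens_kappa(r1, r2):.2f}")       # about 0.70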
Thank you, Mr. Chan, for asking this question. I too have interviews coded in NVivo and realized that calculating inter-rater reliability among three coders is not a straightforward operation. One of the coders was at a qualitative methods training and clarified with the trainers that it could not simply be run in NVivo for the three coders.
Robert Rivers' questions are helpful, and I am going to the link to read some more... best wishes on your research!
I agree that three coders is not a straightforward operation, but I am still puzzled by the two-coder case.
Robert's link is really helpful. I will try to respond to Robert's questions here:
1) I use inter-rater coding for my analysis because I want to increase trustworthiness, not so much to generate it. I tend to agree with the argument that IRR is easy to calculate but that one should be clear about the underlying assumptions. Why do you do it? If it is for generalizability, then it may be seen as a weak attempt to make a qualitative study achieve that. Qualitative research is not known for having the power or purpose for that, and given the small sample size it is even more difficult;
2) the context of my study is qualitative research using focus groups with 30 couples, in four men's groups and four women's groups respectively. Two raters were invited to code the qualitative data, about gender intimacy, into various themes;
3) I am doing this for academic publication purposes; I have faced critiques from reviewers about the lack of IRR in the manuscript.
I am studying your link but would welcome further advice.
I have done thematic analysis of focus group data, but the reviewer of my article asked why I had not used kappa for inter-rater reliability, even though the themes were also verified by subject specialists. Kindly help me with a reference I can use to defend against this objection.
I know it's a little late to answer your question, but I'm sure many researchers are looking for an answer to the same question. Here is my recent paper, which I hope can directly answer it. It also suggests an approach for calculating intercoder reliability:
Nili, A., Tate, M., & Barros, A. (2017). A Critical Analysis of Inter-Coder Reliability Methods in Information Systems Research. Australasian Conference on Information Systems, Hobart, Australia.
The title of the paper includes the words "information systems", but the paper is relevant to most business, management, and other related fields in the social sciences. We are using it for various projects here in Australia.
Cohen's kappa can be used for inter-rater reliability in thematic and content analysis with two raters. Here is an online tool to calculate it: http://vassarstats.net/kappa.html
For more than two raters, Fleiss' kappa can be used.
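For what it is worth, here is a minimal Python sketch of Fleiss' kappa for three or more raters, assuming each rater assigns exactly one code per segment. The code labels and data are invented for illustration; statsmodels also offers a packaged fleiss_kappa in statsmodels.stats.inter_rater if you would rather not compute it by hand:

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of segments, each a list of the codes
    assigned by every rater (same number of raters per segment)."""
    N = len(ratings)          # number of segments
    n = len(ratings[0])       # raters per segment
    categories = {c for seg in ratings for c in seg}

    # per-segment agreement P_i = (sum_j n_ij^2 - n) / (n * (n - 1)), averaged over segments
    P_bar = sum(
        (sum(count ** 2 for count in Counter(seg).values()) - n) / (n * (n - 1))
        for seg in ratings
    ) / N

    # chance agreement P_e = sum_j p_j^2, with p_j the overall share of category j
    total = N * n
    P_e = sum((sum(seg.count(c) for seg in ratings) / total) ** 2 for c in categories)

    return (P_bar - P_e) / (1 - P_e)

# Hypothetical: three raters coding six transcript segments
ratings = [
    ["trust", "trust", "trust"],
    ["conflict", "conflict", "support"],
    ["support", "support", "support"],
    ["trust", "conflict", "trust"],
    ["conflict", "conflict", "conflict"],
    ["support", "trust", "support"],
]
print(f"Fleiss' kappa: {fleiss_kappa(ratings):.2f}")  # 0.50 for this toy data
```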
In any qualitative method, one needs to consider the rationale for using a statistical analysis to determine reliability. Does it fit with the underlying epistemological approach? For thematic analysis it would, because TA is atheoretical and probably closest to a critical realist epistemology. It is more problematic with IPA, grounded theory, discourse analysis, situational analysis, and others.