In our study, we conducted semi-structured interviews with people living with HIV and then analysed the interview transcripts thematically using inductive content analysis. This approach is recommended when prior knowledge about a phenomenon is inadequate or fragmented, so the explanatory or thematic categories are derived from the data itself. In other words, we did not work from pre-determined codes; the thematic framework emerged from the data.
We recently submitted the manuscript to a journal, and the reviewers have asked us to report how we ensured inter-rater reliability. In the analysis phase, two researchers performed the open coding independently and then compared their findings. After iterative discussions, they reconciled their thematic sets by consensus. However, they did not calculate their level of agreement statistically.
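For context, here is a minimal sketch of how such agreement could be computed after the fact, assuming both coders' code assignments can be aligned segment by segment (the codes and segments below are purely hypothetical, and the sklearn function is just one convenient way to compute Cohen's kappa = (p_o - p_e) / (1 - p_e)):

```python
# Minimal sketch: Cohen's kappa between two coders' segment-level codes.
# Assumption: each list element is the code one coder assigned to the same
# transcript segment, so the two lists are aligned position by position.
from sklearn.metrics import cohen_kappa_score

# Hypothetical code assignments for six transcript segments.
coder_a = ["stigma", "adherence", "stigma", "support", "adherence", "support"]
coder_b = ["stigma", "adherence", "support", "support", "adherence", "stigma"]

# kappa = (observed agreement - chance agreement) / (1 - chance agreement)
kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 0.50 for this toy example
```

Whether forcing our consensus-based process into this segment-by-segment framing is appropriate is exactly what I am unsure about.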
I would be glad if you could share your opinions and experience on calculating inter-rater reliability (e.g. Cohen's kappa) in qualitative inquiry such as thematic, content, and discourse analyses, or grounded theory.
Thanks in advance.