Hi, and thanks in advance to everybody.
I have data from a training program that aimed to improve agreement among raters. Raters gave a dichotomous judgment (presence/absence).
The study is multi-centric: several different centers each recruited a different number of subjects. Each subject in each center was evaluated by three judges, both before and after the judges attended the agreement training. The judges were the same for all subjects within a given center but varied across centers.
I'm interested in providing evidence that the training was useful in improving agreement.
I have two questions:
1) It appears to me that I'm in a borderline situation between a two-way random ICC (every subject is evaluated by the same random sample of judges, which is not my case) and a one-way random ICC (every subject is evaluated by a different set of random judges, which is also not my case, since groups of subjects are evaluated by the same group of judges). Is there a way to handle groups of judges, instead of opting for the more "conservative" one-way random strategy? (I've written out the definitions I have in mind below.)
2) Regarding the pre-post training difference, I'm considering using a bootstrap resampling approach. Is it correct to resample subjects, calculate the post-minus-pre ICC difference in every resample, and conclude there is a significant improvement if the 2.5th percentile of the bootstrap distribution of ICC differences is above 0? (A sketch of the procedure I have in mind follows.)
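
For question 1, to make the distinction concrete, these are the single-rater definitions I have in mind, following Shrout & Fleiss (1979), where $\sigma^2_s$, $\sigma^2_j$, and $\sigma^2_e$ are the subject, judge, and residual variance components, and $\sigma^2_w$ is the undifferentiated within-subject variance:

$$\mathrm{ICC}(1) = \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_w}, \qquad \mathrm{ICC}(2,1) = \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_j + \sigma^2_e}$$

In the one-way model the judge variance cannot be separated from measurement error ($\sigma^2_w$ absorbs both), which is what makes ICC(1) the more "conservative" choice.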
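
For question 2, here is a minimal Python sketch of the subject-level bootstrap I mean. The data layout is hypothetical (`pre` and `post` as subjects × judges arrays of 0/1 ratings), and `icc1` / `bootstrap_icc_diff` are just illustrative names; ICC(1) is computed from the usual one-way ANOVA mean squares:

```python
import numpy as np

def icc1(ratings):
    """One-way random, single-rater ICC(1) from one-way ANOVA mean squares."""
    n, k = ratings.shape
    grand_mean = ratings.mean()
    subj_means = ratings.mean(axis=1)
    # Between-subjects and within-subjects mean squares
    ms_between = k * np.sum((subj_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - subj_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def bootstrap_icc_diff(pre, post, n_boot=5000, seed=0):
    """Resample subjects with replacement, keeping each subject's pre and
    post ratings paired, and return the distribution of ICC_post - ICC_pre."""
    rng = np.random.default_rng(seed)
    n = pre.shape[0]
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # same subjects drawn for pre and post
        diffs[b] = icc1(post[idx]) - icc1(pre[idx])
    return diffs

# Hypothetical usage: `pre` and `post` are (n_subjects, 3) arrays of 0/1 ratings.
# diffs = bootstrap_icc_diff(pre, post)
# lo = np.percentile(diffs, 2.5)
# Evidence of improvement if lo > 0 (the 95% percentile CI excludes zero).
```

One point I'm unsure about: since subjects are nested in centers, it might be more faithful to the design to resample subjects within each center (a stratified bootstrap), so that every resample preserves the judge groups.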
Thank you very much!