I have a dataset of approximately 170 CT cases. The plan is for the gold standard to be the consensus evaluation of two radiologists on 12 descriptive parameters and 1 conclusion. Since 170 × 2 readings is quite demanding, would it be acceptable to test inter-rater agreement on a subset of the cases (say, 40), and then, if the kappa on those 40 cases exceeds some threshold such as 0.7, have each of the remaining 130 cases read by only one of the two readers, assigned at random? That way, each reader would read 40 + (130/2) = 105 cases instead of 170. A sketch of the check I have in mind follows below.
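For concreteness, this is roughly how I would run the agreement check on the 40 pilot cases (a minimal sketch in Python; the random placeholder ratings, the per-item "all kappas > 0.7" rule, and the 0.7 cutoff itself are just assumptions to illustrate the idea):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings on the 40 pilot cases: rows are cases, columns are
# the 13 items (12 descriptive parameters + 1 conclusion). Random data is
# used here purely as a placeholder for the real readings.
rng = np.random.default_rng(0)
reader_a = rng.integers(0, 3, size=(40, 13))
reader_b = rng.integers(0, 3, size=(40, 13))

# Cohen's kappa computed separately for each of the 13 items
kappas = [cohen_kappa_score(reader_a[:, j], reader_b[:, j]) for j in range(13)]

for j, k in enumerate(kappas):
    print(f"item {j + 1}: kappa = {k:.2f}")

# The proposed rule: switch to single-reader reading of the remaining
# 130 cases only if agreement clears the (arbitrary) 0.7 threshold on
# every item; otherwise keep double-reading everything.
if all(k > 0.7 for k in kappas):
    print("All kappas > 0.7: split the remaining 130 cases between readers")
else:
    print("Agreement too low on some items: double-read all 170 cases")
```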
Is this a possible shortcut? Thanks a lot.