I need to compare two ICCs (intraclass correlation coefficients) computed from the same three judges rating the same sample of subjects (presence/absence of a certain physical dysfunction), once before and once after a training meant to improve the judges' reliability.
Is there a formal test for the difference between two such dependent ICCs?
Thank you