I have two tasks, A and B, that are supposed to measure the same thing: whether a participant is left-handed or right-handed. However, 11 of my 38 participants are classified differently depending on which task I use, so almost 30% of the sample (11/38 ≈ 29%) gets an inconsistent result across the two tasks. I need a statistical analysis to demonstrate that this disagreement really says something about the reliability of the tasks. How can I do that?
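
To make the numbers concrete, here is a minimal Python sketch of the arithmetic behind the "almost 30%" figure, together with a Wilson score interval for the disagreement rate. The interval is only one possible way to express the uncertainty in 11 out of 38; it is an assumption on my part, not an established answer to the question, and the function name `wilson_interval` is made up for illustration. A chance-corrected agreement measure would need the full 2x2 table of task A versus task B classifications, which is not given here.

```python
import math


def wilson_interval(k, n, z=1.959964):
    """Wilson score interval for a binomial proportion k/n.

    z defaults to the normal quantile for a two-sided 95% interval.
    (Hypothetical helper, written only for this illustration.)
    """
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the question: 11 of 38 participants disagree across tasks.
disagree, total = 11, 38
rate = disagree / total
low, high = wilson_interval(disagree, total)

print(f"Disagreement rate: {rate:.1%}")
print(f"Approximate 95% Wilson interval: {low:.1%} to {high:.1%}")
```

With only 38 participants the interval is fairly wide, which is worth keeping in mind before drawing strong conclusions from the raw percentage.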