I have designed 16 different mid-air haptic icons, and in an identification study I had participants guess which metaphor from a list each icon represented. This gave, for each icon, the percentage of participants who identified the correct metaphor. I now want to know whether there is a correlation between the type of icon and the participants' identification scores. Icons can be classified on a continuum from abstract to representational, with semi-abstract lying in between. To classify my icon designs on this continuum, 3 raters gave each icon a score from 1 to 5 (i.e. 1 = abstract, 3 = semi-abstract, 5 = representational).
My question is twofold:
- Can I accept the modal rating across the 3 raters as "true" (percentage agreement 66%), or do I need to reach consensus agreement by revisiting the definitions with the raters?
- Also, to take chance agreement into account, how should I calculate the kappa value (in SPSS), and what value would be acceptable?
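
To make the second question concrete, here is a minimal sketch of the kind of chance-corrected agreement I have in mind, assuming Fleiss' kappa is a suitable statistic for 3 raters (that is an assumption on my part). It is in Python rather than SPSS, only to show the arithmetic. The function name `fleiss_kappa` and the `example_ratings` values are hypothetical and made up for illustration; they are not my data. Note that this statistic treats the 1 to 5 scores as unordered categories.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of raters who gave item i the category categories[j]
    counts = [[Counter(row)[c] for c in categories] for row in ratings]
    # Observed agreement per item: share of rater pairs that agree
    p_items = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items
    # Expected agreement by chance, from the overall category proportions
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical ratings: 16 icons, 3 raters each, scores 1-5 (NOT real data)
example_ratings = [
    (1, 1, 2), (5, 5, 5), (3, 3, 4), (2, 2, 2),
    (4, 5, 5), (1, 1, 1), (3, 2, 3), (5, 4, 5),
    (2, 3, 3), (4, 4, 4), (1, 2, 1), (3, 3, 3),
    (5, 5, 4), (2, 2, 1), (4, 3, 4), (1, 1, 1),
]

kappa = fleiss_kappa(example_ratings, categories=[1, 2, 3, 4, 5])
print(f"Fleiss' kappa (hypothetical data): {kappa:.3f}")
```

Because the ratings lie on an ordered continuum, a statistic that accounts for the size of disagreements might be more appropriate than this nominal version; that is part of what I am unsure about.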