For those of you with expertise in measurement and replication, what would you recommend in the following scenario:

We are preparing to replicate a between-subjects experiment in which participants (N = 135) were randomly assigned to read one of 3 versions of a story (1 experimental condition and 2 control conditions) and were then asked whether the protagonist in the story "knows" or "only believes" the stated claim (binary response; 3 × 2 chi-square).

The original authors found no significant difference between the experimental condition and the "knowledge" control condition (Fisher's p = .164) and a large, significant difference between the experimental condition and the "ignorance" control condition (Fisher's p = .001, Cramér's V = .509). However, they claim that a more recent study suggests that if people are asked the same question using a graded response format (e.g., visual analogue scale, Likert-type), there may actually be a small real difference (d ≈ 0.1) between the experimental condition and the knowledge control that went undetected in their first study. This would suggest that people might respond with more nuance given the opportunity.
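For context on how small d ≈ 0.1 is, here is a back-of-the-envelope sample size calculation for a two-group comparison. It is only a rough sketch: it uses the standard normal approximation for a two-sided test, and the α = .05 and 80% power values are my own assumptions, not figures from the original study. The function name `n_per_group` is just illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    alpha and power are assumed defaults, not values from the original paper.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


n_small = n_per_group(0.1)  # effect size suggested by the more recent study
print(f"Approximate n per group for d = 0.1: {n_small}")
```

Under these assumptions the answer is on the order of 1,500 or more participants per group, which is far beyond both the original sample and our pretest.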

Therefore, we conducted a pretest (N = 165) using a visual analogue scale (VAS; where 0 represents "only believes" and 100 represents "knows") in lieu of the original binary response format. Based on the pretest data, though, it appears that participants still responded quite dichotomously (most responses were clustered near 0 or 100 in each condition, though less so in the experimental condition; see attached figure).

If this were your replication study, would you stick to a binary response type (knows/only believes) or try using a potentially more sensitive measure (VAS, Likert-type) to obtain more fine-grained data? With continuous data, we could also dichotomize the responses and run the non-parametric tests alongside the parametric ones in order to compare the results, but I'm worried that the parametric tests may violate some statistical assumptions. Namely, wouldn't such strongly bimodal data violate the normality assumption of most parametric tests?
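To make the comparison I have in mind concrete, here is a minimal sketch using only the Python standard library. The data are simulated and purely hypothetical (they are not our pretest data); the group size of 55, the endpoint proportions, and the midpoint cutoff of 50 are all assumptions for illustration. It dichotomizes the VAS scores and runs a two-sided Fisher's exact test, then analyzes the undichotomized scores with a permutation test on the difference in means, which I have swapped in here because it does not rely on a normality assumption.

```python
import random
from math import comb


def simulate_bimodal_vas(n, p_high, rng):
    """Hypothetical VAS scores (0-100) clustered near the two endpoints."""
    return [100 - rng.uniform(0, 10) if rng.random() < p_high
            else rng.uniform(0, 10) for _ in range(n)]


def dichotomize(scores, cutoff=50):
    """Return (n 'knows', n 'only believes') using an assumed midpoint split."""
    high = sum(s >= cutoff for s in scores)
    return high, len(scores) - high


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):  # hypergeometric probability of k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one.
    total = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


rng = random.Random(42)
experimental = simulate_bimodal_vas(55, 0.30, rng)  # hypothetical condition
control = simulate_bimodal_vas(55, 0.80, rng)       # hypothetical condition

exp_counts = dichotomize(experimental)
ctl_counts = dichotomize(control)
p_fisher = fisher_exact_two_sided(*exp_counts, *ctl_counts)
p_perm = permutation_p_value(experimental, control)
print(f"Fisher (dichotomized): p = {p_fisher:.4f}")
print(f"Permutation (raw VAS): p = {p_perm:.4f}")
```

When nearly all responses sit at the endpoints, as in this simulation, I would expect the two analyses to lead to much the same conclusion, which is part of why I am unsure the VAS buys us anything here.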
