I would like to discuss the following idea about the assessment of test-retest reliability:

Imagine that we conducted a study (n=100) with a 20-item questionnaire (it1, it2, …, it20) that was administered twice (e.g. the second time 3 weeks after the first). We calculated the test-retest reliability of the scale and it was r=.79. Now imagine we found that the test-retest reliability increases significantly if we delete item it14 from the scale, for example, r(w/o it14)=.89.

Are we allowed to delete it14 from further analysis because it decreases the test-retest reliability? Is this a valid way to develop reliable scales? Does it violate any assumption of reliability? Do you know of any evidence from the literature that supports this approach?

Thanks for sharing your opinions.
