Dear colleagues,
I had been under the impression that IRT estimators of reliability are (somehow) more accurate than the classical estimators. To our surprise, simple simulations that my colleague and I ran suggest that this is not the case. Just as the classical estimators of reliability, such as coefficients alpha, theta, omega, and maximal reliability, underestimate reliability with easy and difficult tests, IRT estimators such as marginal reliability, empirical reliability, person separation index, and measure of accuracy are also prone to underestimation when the test difficulty does not match the (average) ability level of the test takers. That is, if the test is easy or difficult for the test takers, the reliability estimates may be radically deflated, in some cases even below zero.
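For anyone who wants to reproduce the general pattern quickly, here is a minimal sketch. It is not the simulation from the preprint, and every setting in it is an assumption chosen for illustration: a Rasch-type response model, standard normal abilities, 20 equally difficult binary items, 2000 simulated test takers, and a "difficult" test placed 3 logits above the mean ability. The function names are hypothetical. It compares only coefficient alpha between a test matched to the ability level and a very difficult test; it does not by itself show how much of the drop is underestimation of the true reliability.

```python
import math
import random


def simulate_responses(n_persons, n_items, difficulty, seed):
    """Simulate 0/1 item responses under an assumed Rasch-type model.

    Abilities are drawn from N(0, 1); all items share one difficulty.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        # Probability of a correct response for this person on every item.
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
        data.append([1 if rng.random() < p else 0 for _ in range(n_items)])
    return data


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def coefficient_alpha(data):
    """Coefficient alpha from a persons-by-items matrix of scores."""
    k = len(data[0])
    item_variances = [sample_variance([row[j] for row in data]) for j in range(k)]
    total_variance = sample_variance([sum(row) for row in data])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Test matched to the average ability (difficulty 0) versus a very
# difficult test (difficulty 3); both values are illustrative assumptions.
alpha_matched = coefficient_alpha(simulate_responses(2000, 20, 0.0, seed=1))
alpha_hard = coefficient_alpha(simulate_responses(2000, 20, 3.0, seed=1))

print("alpha, matched difficulty:", round(alpha_matched, 3))
print("alpha, difficult test:    ", round(alpha_hard, 3))
```

Setting the difficulty to a large negative value gives the corresponding easy-test case. The IRT-based estimators mentioned above would need a fitted IRT model and are not covered by this sketch.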
Please take a look at the linked manuscript/preprint. If you spot anything that seems to have been done incorrectly, feel free to let us know. If there are no obvious mistakes, the question is: what is the reason for this deflation, and how can it be corrected? We have some ideas, but it would be nice to hear your views. Preprint: Deflation-corrected estimators of reliability as a part of a...