IRT estimators of reliability give too low estimates?

Tine Nielsen

Are you sure that the reason is as stated here? Maybe not everybody checks for local dependence between items; maybe some adjust for this and others do not. Local dependence will falsely inflate reliability... just a thought.

Jari Metsämuuronen

Thank you, Tine Nielsen, for the reply! That is also a relevant point in reliability estimation. In our simulation, we deliberately simulate the case where the test difficulty does not match the test takers' achievement level, and these are independent. This is the kind of situation where a test that would discriminate well among grade 3 pupils is given to grade 4 or 5 students. A test that is too easy (or too difficult) leads to deflated reliability estimates. In practice, then, even if the test score were accurate, it could not discriminate between the test takers. So is the test poor, as the low reliability indicates? Or is the test good, with the low reliability caused by technical reasons, as with the traditional estimators (alpha, theta, omega, rho)? In the traditional estimators, the reason is the poor behavior of Rit, which gives deflated estimates of correlation. The phenomenon seems to be the same in IRT estimators. What could be the reason? Does the item-wise information function depend on normality? Is normality baked into the estimators?

Sure. "This is the kind of situation where a test that would discriminate well among grade 3 pupils is given to grade 4 or 5 students. A test that is too easy (or too difficult) leads to deflated reliability estimates." This means that the variation is very small, because the test is too easy (all correct) or too difficult (all wrong), in rough terms, and thus, of course, reliability will decrease...
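The effect described above can be illustrated with a small simulation. This is only a minimal sketch under assumptions that are not from the thread: a Rasch model, 20 items, 2000 simulated test takers with theta drawn from N(0, 1), and a "too easy" test obtained by shifting every item three logits down. The function names `simulate_scores` and `cronbach_alpha` are illustrative, not the setup used in the simulation discussed here.

```python
import math
import random
import statistics


def simulate_scores(n_persons, difficulties, seed):
    """Simulate 0/1 item scores under a Rasch model with theta ~ N(0, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        row = []
        for b in difficulties:
            p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append(1 if rng.random() < p_correct else 0)
        data.append(row)
    return data


def cronbach_alpha(data):
    """Coefficient alpha from a persons-by-items matrix of item scores."""
    k = len(data[0])
    item_vars = [statistics.pvariance([row[j] for row in data]) for j in range(k)]
    total_var = statistics.pvariance([sum(row) for row in data])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Assumed item difficulties: spread around the group mean (matched test).
matched = [-1.0 + 2.0 * j / 19 for j in range(20)]
# The same items shifted three logits easier (test aimed at a lower grade).
too_easy = [b - 3.0 for b in matched]

alpha_matched = cronbach_alpha(simulate_scores(2000, matched, seed=1))
alpha_too_easy = cronbach_alpha(simulate_scores(2000, too_easy, seed=1))
print(f"alpha, matched test:  {alpha_matched:.3f}")
print(f"alpha, too-easy test: {alpha_too_easy:.3f}")
```

Both tests use items with identical discrimination; only the match between difficulty and ability differs, yet alpha comes out clearly lower for the shifted test.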

Tine Nielsen, exactly. This is the reason for the deflation in the estimates from the classical estimators: the skewed score is an outcome of items that are too easy or too difficult, and the item-total correlations based on covariation (R) embedded in the classical reliability estimators are always deflated (approaching 0) for items of extreme difficulty, regardless of the true correlation. This kind of artificial setting, where a test suitable for G3 pupils is administered to G4-5 pupils, is comparable to administering, for example, a very simple language screening test to the students. In both cases, the test may accurately discriminate between the lowest-performing test takers and the others (medium- or high-performing ones), but the traditional estimators cannot detect this. The deflation-corrected estimators can: if the embedded item-total correlation (R) is replaced with, e.g., the polychoric correlation (or some other robust estimator, such as D or G), which does not depend on the variance in the score or in the items, the reliability estimates are deflation-corrected.
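The deflation of a covariance-based correlation at extreme difficulty can be shown with the textbook relation between the biserial and point-biserial correlations. The sketch below assumes a normally distributed score and a latent normal item response that is dichotomized at a threshold, and it uses a continuous score in place of the item-total score; these are simplifying assumptions for illustration, not the setting of the simulation discussed above. The name `observed_item_correlation` is hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def observed_item_correlation(latent_r, threshold):
    """Pearson (point-biserial) r between a dichotomized item and a normal score.

    latent_r  : correlation between the latent item response and the score
    threshold : z cut-off above which the item is scored 1
    """
    p = 1.0 - STD_NORMAL.cdf(threshold)  # proportion of test takers scoring 1
    return latent_r * STD_NORMAL.pdf(threshold) / math.sqrt(p * (1.0 - p))


# Same latent correlation (0.8), increasingly extreme item difficulty.
for z in (0.0, 1.0, 2.0, 3.0):
    p = 1.0 - STD_NORMAL.cdf(z)
    print(f"p = {p:.3f}  observed r = {observed_item_correlation(0.8, z):.3f}")
```

The latent correlation is held fixed, so any drop in the printed values comes only from the dichotomization threshold moving toward the tail.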

I was kind of hoping that IRT estimators would behave better in this respect because they are not dependent on variance per se. That was my question: what, then, is the mechanism in those estimators? Or is it that the item-wise Fisher information used in some of the estimators is somehow based on, or affected by, variance?
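As a point of reference for the question, the item information of a two-parameter logistic (2PL) item is a²·P(θ)·(1 − P(θ)), where P(θ)·(1 − P(θ)) is the conditional variance of the binary response at ability θ. The sketch below evaluates this for a matched and a too-easy test and turns it into one common information-based reliability approximation, var(θ) / (var(θ) + E[1/I(θ)]) with θ ~ N(0, 1). The choice of model, the item parameters, the grid, and the function names are all assumptions made for illustration; other IRT reliability estimators are defined differently and may behave differently.

```python
import math
from statistics import NormalDist


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def marginal_reliability(items, n_grid=81, lo=-4.0, hi=4.0):
    """Approximate 1 / (1 + E[1 / I(theta)]) for theta ~ N(0, 1) on a grid."""
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    weights = [NormalDist().pdf(t) for t in grid]
    # Weighted mean of the error variance 1 / I(theta) over the ability grid.
    mean_error_var = sum(
        w / sum(item_information(t, a, b) for a, b in items)
        for t, w in zip(grid, weights)
    ) / sum(weights)
    return 1.0 / (1.0 + mean_error_var)


# Assumed (a, b) pairs: 20 items with a = 1, difficulties from -1 to 1.
matched_items = [(1.0, -1.0 + 2.0 * j / 19) for j in range(20)]
# The same items shifted three logits easier.
easy_items = [(a, b - 3.0) for a, b in matched_items]

rel_matched = marginal_reliability(matched_items)
rel_easy = marginal_reliability(easy_items)
print(f"information-based reliability, matched:  {rel_matched:.3f}")
print(f"information-based reliability, too easy: {rel_easy:.3f}")
```

In this sketch the information is small wherever P(θ) is close to 0 or 1, so a test whose items are far from the ability distribution yields large error variances for most test takers even though no total-score variance enters the formula.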