How to deal with chance of guessing in logistic regression?

03 March 2013

I have the following situation: a number of texts have been rated for readability. The research question is: is the readability score we used valid? We ran a study in which participants first read a text and then answered a multiple-choice question. Because the choices were mutually exclusive (exactly one correct choice), there is a 25% chance of answering correctly by guessing.

We ran a logistic regression with correct/incorrect as the response and readability score as the predictor. We find the desired relationship, but we also want to predict performance for texts of lower readability than those in our sample. How can we formulate the model or adjust the estimates so that the predicted probability of a correct answer never falls below 25%?

Note that the data analysis is being done by a bachelor student. Pragmatic solutions using standard software are preferred over profound ones. Also note that we have to use either a mixed-effects model or GEE to account for repeated measures.
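To make the request concrete, here is a minimal sketch of what a logistic model with a 25% floor could look like. It rescales the ordinary logistic curve so that it runs from the guessing rate up to 1 instead of from 0 to 1. The function names, the `GUESS` constant, and the parameter names are made up for illustration; the sketch ignores the repeated-measures structure entirely and only shows the shape of the model and its likelihood, not a fitting procedure.

```python
import math

GUESS = 0.25  # assumed guessing rate, taken from the 25% figure in the question


def p_correct(score, intercept, slope, guess=GUESS):
    """Probability of a correct answer with a lower asymptote at `guess`.

    p = guess + (1 - guess) * logistic(intercept + slope * score)
    """
    eta = intercept + slope * score
    # Numerically stable logistic function for large positive or negative eta.
    if eta >= 0:
        logistic = 1.0 / (1.0 + math.exp(-eta))
    else:
        e = math.exp(eta)
        logistic = e / (1.0 + e)
    return guess + (1.0 - guess) * logistic


def neg_log_likelihood(params, scores, correct, guess=GUESS):
    """Bernoulli negative log-likelihood of 0/1 responses under the floor model."""
    intercept, slope = params
    total = 0.0
    for x, y in zip(scores, correct):
        p = p_correct(x, intercept, slope, guess)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total
```

With this parameterisation the predicted probability approaches 25% as the linear predictor goes to minus infinity and never drops below it, whatever the coefficients are. Minimising `neg_log_likelihood` with a general-purpose optimiser would give estimates for independent observations only; adding random effects or GEE on top of this is not covered by the sketch.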

Michael E Young

Issue #1: "we also want to predict performance at texts of lower readability than in our sample." A simple reminder that extrapolation is always dangerous.

Issue #2: You have to allow the model to produce the full range of possibilities, from predicting all 0s to all 1s. It's possible that under some conditions people would actually produce performance below 25% (e.g., if the subject were wired backward such that wrong answers looked more correct than right answers).

I'm guessing that the question arose because, in trying to extrapolate (Issue #1), you generated predicted values below chance, which isn't what you expected. Well, that's the trouble with extrapolation!

So, the real answer is not statistical: be sure that your data set includes texts of lower readability, so that the model is fitted to these conditions and your prediction doesn't require extrapolation.

A completely different answer is to aggregate the data into a percentage-correct score and then convert that score to a d' using signal detection theory, where 25% corresponds to a d' of zero. See Hacker & Ratcliff (1979) in Perception & Psychophysics. The drawback is that aggregation loses information, but I think this matches your conceptualization of the problem. It may not help with the extrapolation issue, however.
