03 November 2017

I'm running a logistic regression with about 200k observations. One of the predictors is binary, and only 4 of the 200k observations have the value "1". The coefficient I get for that predictor is sensitive to the epsilon value I choose.

With the code below I've tested epsilon = 1e-6, 1e-8, 1e-9, ... and each one gives me a completely different coefficient, in both value and sign.

The coefficients for all the other predictors converge.

I'm guessing R uses an iterative optimization algorithm to fit the glm. Does this mean that the algorithm does not converge?

How should this be handled?

m3_new = glm(formula = input_formula,
             family = binomial(link = 'logit'),
             data = dat_train, trace = TRUE, epsilon = 1e-8)
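One way to probe the symptom is to reproduce it on simulated data and look at the cross-tabulation of the rare predictor against the outcome. The sketch below is only an illustration under a loud assumption: that all 4 rare cases share the same outcome (quasi-complete separation), which is not confirmed for the real data. The names `sim`, `x_rare`, `x_other` and `fit_with_eps` are made up for the example, and the sample is smaller than 200k to keep it quick.

```r
set.seed(1)

# Simulated stand-in for the real data (hypothetical names and sizes).
n <- 20000
sim <- data.frame(x_other = rnorm(n), x_rare = 0L)
sim$y <- rbinom(n, 1, plogis(-1 + 0.5 * sim$x_other))

# Assumption: the 4 rare cases all have the same outcome (y = 1).
rare_rows <- 1:4
sim$x_rare[rare_rows] <- 1L
sim$y[rare_rows] <- 1L

# An empty cell in this table means the rare predictor separates the outcome.
print(table(x_rare = sim$x_rare, y = sim$y))

# Refit with several convergence tolerances, as in the question.
fit_with_eps <- function(eps) {
  suppressWarnings(
    glm(y ~ x_other + x_rare,
        family = binomial(link = "logit"),
        data = sim, epsilon = eps)
  )
}

eps_values <- c(1e-6, 1e-8, 1e-10)
fits <- lapply(eps_values, fit_with_eps)

# Compare the rare-predictor coefficient with a well-identified one,
# along with the iteration count and convergence flag stored on each fit.
summary_tab <- data.frame(
  epsilon      = eps_values,
  coef_x_rare  = sapply(fits, function(f) coef(f)[["x_rare"]]),
  coef_x_other = sapply(fits, function(f) coef(f)[["x_other"]]),
  iterations   = sapply(fits, function(f) f$iter),
  converged    = sapply(fits, function(f) f$converged)
)
print(summary_tab)
```

In this simulated setting the `x_rare` coefficient keeps growing as epsilon is tightened while `x_other` stays put, because the fit stops on the relative change in deviance rather than on the coefficients. Running the same `table()` check and inspecting `m3_new$converged` and `m3_new$iter` on the real fit would show whether the same thing is happening there.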
