I'm running a logistic regression on about 200k observations. One of the predictors is binary, and only 4 of the 200k observations have the value "1" for it. The coefficient I get for that predictor is sensitive to the epsilon value I choose.
With the code below I've tested epsilon = 1e-6, 1e-8, 1e-9, ... and each gives me a completely different coefficient, in both value and sign.
The coefficients for all the other predictors converge.
I'm guessing R uses an iterative optimization algorithm to fit the glm. Does this mean the algorithm is not converging for that coefficient?
How should I handle this?
m3_new <- glm(formula = input_formula,
              family = binomial(link = 'logit'),
              data = dat_train, trace = TRUE, epsilon = 1e-8)
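
For reference, here is a minimal sketch that may reproduce this kind of behaviour on simulated data. It is not my actual dat_train: the variables x, rare, and y are made up for illustration, and it assumes that all 4 of the rare cases share the same outcome, which I have not verified in my own data. It cross-tabulates the rare predictor against the outcome and then refits the model with several epsilon values to compare the coefficient, the iteration count, and the converged flag.

```r
# Hypothetical simulated data (assumption: NOT the original dat_train).
set.seed(1)
n <- 2000
x <- rnorm(n)
rare <- integer(n)
rare[1:4] <- 1L                       # only 4 observations have rare == 1
y <- rbinom(n, 1, plogis(-0.5 + x))
y[rare == 1L] <- 1L                   # assumption: all 4 rare cases share one outcome

# Cross-tabulate the rare predictor against the outcome.
# An empty cell means the predictor perfectly predicts the outcome for those rows.
tab <- table(rare = rare, y = y)
print(tab)

# Refit with several convergence tolerances and collect diagnostics.
eps_values <- c(1e-6, 1e-8, 1e-10)
fits <- lapply(eps_values, function(eps) {
  suppressWarnings(
    glm(y ~ x + rare,
        family = binomial(link = "logit"),
        control = glm.control(epsilon = eps, maxit = 100))
  )
})

res <- data.frame(
  epsilon   = eps_values,
  coef_rare = sapply(fits, function(f) unname(coef(f)["rare"])),
  iter      = sapply(fits, function(f) f$iter),
  converged = sapply(fits, function(f) f$converged)
)
print(res)
```

On the real model, the same checks would be table() of the rare predictor against the response in dat_train, plus m3_new$converged and m3_new$iter after each fit.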