I am building a logistic regression model on a binary outcome whose classes are imbalanced in roughly an 80:20 ratio.

  • Does class imbalance affect the performance of a logistic regression model? If that depends on how severe the imbalance is, what are the criteria for considering a dataset imbalanced?
  • Is the effect confined to the predicted probabilities? That is, does imbalance merely skew the distribution of predicted probabilities, so that the resulting misclassifications can be controlled by choosing an appropriate cut-off threshold?
  • Is there a function in R that calculates the optimal cut-off directly?

Thanks.
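
To make the cut-off idea concrete, here is a minimal sketch of what I mean by choosing a threshold. It is written in Python only for illustration (I am still looking for the R equivalent). The helper name `best_cutoff` and the toy probabilities are hypothetical, and using Youden's J (sensitivity + specificity - 1) as the definition of "optimal" is my own assumption; other criteria, such as cost-weighted error, would give different cut-offs.

```python
def best_cutoff(probs, labels):
    """Scan candidate cut-offs and return (threshold, J) maximizing Youden's J.

    probs  : predicted probabilities of the positive class
    labels : true classes, 1 = positive (minority), 0 = negative
    A case is predicted positive when its probability is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_t, best_j = None, -1.0
    # Only the observed probabilities need to be tried as thresholds.
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < t and y == 0)
        j = tp / positives + tn / negatives - 1.0  # sensitivity + specificity - 1
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical toy data with an 80:20 split: 8 negatives, 2 positives.
probs = [0.05, 0.10, 0.12, 0.15, 0.20, 0.22, 0.25, 0.40, 0.35, 0.45]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

threshold, j = best_cutoff(probs, labels)
print(threshold, j)
```

On this toy data every predicted probability is below 0.5, so the default 0.5 cut-off would label all cases negative, while the selected threshold is lower and recovers both positives at the cost of one false positive.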
