Hello everyone, I'm seeking some advice or references related to the optimal number of observations needed per category within a categorical variable for machine learning projects. I've come across a rule of thumb suggesting that a minimum of 20 observations per category is advisable. However, I'm curious about the community's views on this and whether there's any literature or research that could provide more detailed guidance or confirm this rule. Any insights or recommendations for readings on this topic would be greatly appreciated. Thank you!
