Dear colleagues,

I am trying to implement a deep neural network model for regression prediction and I have some categorical features with more than 100 classes ( in some cases even more than 900 classes). the approach I have in mind is as follows:

First, I will reduce the number of levels by combining the levels which their frequency is less than 5% of total observation. This way, I reduce the number of classes to less than 100. Then, I will use dummy coding for the remaining classes.

Now my questions are:

1- What do you think about this approach? do you have another suggestion?

2- How should I code the aforementioned approach in Python?

Similar questions and discussions