In a regression on a dataset with N = 1200, I have an independent dummy variable that measures whether the respondent is unemployed or employed. The variable has the following distribution:
Unemployment = 0 - Frequency: 1196
Unemployment = 1 - Frequency: 4
The regression gives me a significant but very counterintuitive coefficient (specifically, that Life Satisfaction has a positive association with unemployment). I think, however, that it is wrong to draw a conclusion from just 4 cases with Unemployment = 1. I also have other dummy variables where the situation is even less clear. For example:
Dummy = 0 - Frequency: 1170
Dummy = 1 - Frequency: 30
Or, with more than two categories:
Categorical option A = 0 - Frequency: 1150
Categorical option B = 1 - Frequency: 30
Categorical option C = 2 - Frequency: 12
Categorical option D = 3 - Frequency: 8
Can I draw valid conclusions from this? More generally, is there a minimum number of observations needed per response category of each independent variable for the resulting conclusions to be reliable? If so, how can I calculate this number?
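
To make the concern concrete, here is a minimal sketch of how the group sizes above affect precision. It assumes the simplest possible case, which is not my actual model: a regression with the dummy as the only predictor and the same error standard deviation sigma in both groups. Under those assumptions the dummy coefficient is just the difference in group means, and its standard error is sigma * sqrt(1/n0 + 1/n1). The function name `dummy_se_factor` is my own, purely for illustration.

```python
import math


def dummy_se_factor(n0: int, n1: int) -> float:
    """Return sqrt(1/n0 + 1/n1).

    Multiplying this factor by the error standard deviation gives the
    standard error of a dummy coefficient in a regression with no other
    predictors and equal error variance in both groups (assumptions made
    only for this illustration).
    """
    if n0 <= 0 or n1 <= 0:
        raise ValueError("both groups need at least one observation")
    return math.sqrt(1.0 / n0 + 1.0 / n1)


# Group sizes taken from the frequency tables above, plus a balanced
# split of the same N = 1200 for comparison.
splits = {
    "unemployment (1196 vs 4)": (1196, 4),
    "dummy (1170 vs 30)": (1170, 30),
    "balanced (600 vs 600)": (600, 600),
}

for label, (n0, n1) in splits.items():
    print(f"{label}: SE factor = {dummy_se_factor(n0, n1):.3f}")
```

With 4 cases in the small group the factor is about 0.50, against about 0.18 for the 1170/30 split and about 0.06 for a balanced split. So the precision is driven almost entirely by the smallest group and hardly at all by the total N. This says nothing about whether the 4 unemployed respondents are representative, which is a separate worry.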