If a qualitative variable has m categories, introduce only (m − 1) dummy variables. For example, if a qualitative variable has three categories, we introduce only two dummies. If you do not follow this rule, you will fall into what is called the dummy variable trap, that is, a situation of perfect collinearity or perfect multicollinearity. @Mohammed Qadoury Abed
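A minimal sketch of the trap, assuming pandas and NumPy and a toy "color" column with m = 3 categories (both names are just illustrative): with an intercept, all m dummies sum to the constant column and the design matrix loses rank, while m − 1 dummies keep it full rank.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "red", "green", "blue"]})

# All m = 3 dummies plus a constant column: the dummies sum to 1, so the
# design matrix is rank deficient (perfect multicollinearity).
full = pd.get_dummies(df["color"], dtype=float)
X_trap = np.column_stack([np.ones(len(df)), full.to_numpy()])
print(np.linalg.matrix_rank(X_trap), "of", X_trap.shape[1], "columns")  # 3 of 4

# m - 1 = 2 dummies (one category dropped as the reference) plus a constant:
# full column rank, no trap.
reduced = pd.get_dummies(df["color"], drop_first=True, dtype=float)
X_ok = np.column_stack([np.ones(len(df)), reduced.to_numpy()])
print(np.linalg.matrix_rank(X_ok), "of", X_ok.shape[1], "columns")  # 3 of 3
```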
Abdur Rahman I am not sure about removing one level of a categorical variable, because in the regression you have a reference level and thus you lose some information. For example, if your variable is age group with levels 20-30, 31-40, 41-50, 51-60, 61-70 and your reference is 20-30, the odds ratio of each age group is compared against it; if you remove the 61-70 level, you no longer have this information. For my ongoing research project, to handle the multicollinearity, I prefer to use gradient boosting decision tree classification models (and since these models are difficult to interpret, I use a SHAP diagram).
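A small sketch of how the reference level enters such a model, using synthetic data and the statsmodels formula API (the dataset, the outcome y, and the group labels are only an illustration of the age-group example above): treatment coding with "20-30" as the reference gives m − 1 = 4 dummy coefficients, and exponentiating them yields each group's odds ratio relative to the reference.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
groups = ["20-30", "31-40", "41-50", "51-60", "61-70"]
df = pd.DataFrame({"age_group": rng.choice(groups, size=500)})
# Synthetic binary outcome whose probability increases with age group.
df["y"] = rng.binomial(1, 0.3 + 0.08 * df["age_group"].map(
    {g: i for i, g in enumerate(groups)}))

# Treatment coding with "20-30" as the reference: 4 (= m - 1) dummy
# coefficients, and exp(coef) is each group's odds ratio vs 20-30.
model = smf.logit("y ~ C(age_group, Treatment(reference='20-30'))", data=df).fit()
print(np.exp(model.params))  # odds ratios relative to the 20-30 reference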
Suppose you need to estimate m coefficients in a linear regression model, i.e. the coefficients of all the real-valued variables and the dummy variables you have entered (the m − 1 dummies plus the free coefficient). This total cannot be greater than the number of your observations, since estimating the coefficients amounts to solving a system of linear equations.
Exact equality is also undesirable, because in this case (assuming linear independence of the rows and columns) you get a single exact solution: the model fits the data perfectly, leaving no residual degrees of freedom to estimate the error variance. Thus you must have at least one more observation than m.
If this sum is greater than the number of your observations, it means that the amount of data does not allow you to build such a detailed model.
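A minimal numerical sketch of this counting argument, using only NumPy and an assumed toy design (intercept, one continuous predictor, and m − 1 = 2 dummies, so p = 4 coefficients): with n = p observations the fit is exact with zero residuals and no degrees of freedom left for the error variance, while with n > p there are residuals and n − p degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit(n):
    # Design: intercept + one continuous predictor + 2 dummies -> p = 4 coefficients.
    x = np.linspace(0.0, 1.0, n)
    cat = np.arange(n) % 3                     # categorical with m = 3 levels
    dummies = np.eye(3)[cat][:, 1:]            # m - 1 = 2 dummy columns
    X = np.column_stack([np.ones(n), x, dummies])
    y = rng.normal(size=n)                     # arbitrary synthetic response
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    p = X.shape[1]
    print(f"n={n}, p={p}, residual SS={resid @ resid:.6f}, error df={n - p}")

fit(4)    # n == p: exact fit, zero residuals, no df left for the error variance
fit(20)   # n > p: nonzero residuals, 16 df to estimate the error variance
```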