I want to fit a multiple linear regression model with 457 samples. Among the independent variables, there is one categorical variable with three levels (XL1, XL2 and XL3). The counts for XL1, XL2 and XL3 are 229, 214 and 14, respectively. Looking at the level proportions, XL3 accounts for only about 0.03 (14/457) of the sample, so it seems it could be neglected or at least merged with another level.
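For reference, here is a minimal sketch of how I computed the level proportions from the counts above. The variable names are my own and only the three counts come from my data.

```python
# Level counts of the categorical variable, as reported in the question
counts = {"XL1": 229, "XL2": 214, "XL3": 14}

# Total sample size (should equal 457)
total = sum(counts.values())

# Proportion of each level in the sample
proportions = {level: n / total for level, n in counts.items()}

for level, p in proportions.items():
    print(f"{level}: n = {counts[level]}, proportion = {p:.3f}")
```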
Is there a statistical test that can be used to decide whether the number of levels of a design variable can be reduced?
Anyway, I have reviewed one paper that stated the hypotheses as below:
H0: pi = 0
H1: pi ≠ 0
where pi is the proportion of class i, and the significance level is 0.05.
The paper claimed that if pi < 0.05, then class i can be removed from the design variable.
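To make the rule concrete, this is a rough sketch of how I read it, applied to my counts. It assumes the paper compares the observed class proportion directly with 0.05; the function name `levels_below_threshold` is hypothetical and not taken from the paper.

```python
def levels_below_threshold(counts, threshold=0.05):
    """Return the levels whose sample proportion is below `threshold`.

    This is only my interpretation of the paper's rule (pi < 0.05),
    not a confirmed implementation of its method.
    """
    total = sum(counts.values())
    return [level for level, n in counts.items() if n / total < threshold]


# Counts from my data set (457 samples in total)
counts = {"XL1": 229, "XL2": 214, "XL3": 14}

# Levels that the rule, as I understand it, would flag for removal or merging
flagged = levels_below_threshold(counts)
print(flagged)
```

Under this reading, only XL3 is flagged, since 14/457 is below 0.05 while the other two levels are far above it.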
I wonder how this can be, since such a pi value falls under the alternative H1, which is pi ≠ 0. Is this a correct approach? If so, please explain why.