03 March 2017

I have a potential problem that I may have to answer for in one of my ongoing research projects. I am studying the effect of a variable on mortality in cancer patients. The observations for this variable are distributed over several groups with a wide range of group sizes. For example, Group A has 10,000 observations, Group B has 7,000, Group C has 4,000, Group D has 5,000, and Group E has just 100. I am thinking of excluding Group E because its extremely small sample size might affect the statistical significance of my results. Am I statistically allowed to do so? Given the above example, is there a cut-off for group size below which I can safely exclude the group with the fewest observations? What if the smallest group is the one that contains my "variable of interest", whose effect I want to test? Are there any statistically sound solutions to this problem?
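To make the concern concrete, here is a minimal sketch of how much less precise an estimate from the smallest group would be. It uses the group sizes from my example, but the mortality proportion of 0.30 is purely a hypothetical value assumed for illustration, and the helper name `wald_half_width` is my own. It uses the simple normal-approximation (Wald) interval for a proportion, not any particular survival model.

```python
import math

# Group sizes taken from the example above.
group_sizes = {"A": 10_000, "B": 7_000, "C": 4_000, "D": 5_000, "E": 100}

# Hypothetical mortality proportion, assumed only for illustration.
ASSUMED_MORTALITY = 0.30


def wald_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation (Wald) 95% CI for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Half-width of the 95% CI for each group under the assumed proportion.
half_widths = {
    group: wald_half_width(ASSUMED_MORTALITY, n)
    for group, n in group_sizes.items()
}

for group, hw in half_widths.items():
    print(f"Group {group}: n={group_sizes[group]:>6}, 95% CI half-width ~ {hw:.3f}")
```

Under this assumption, the interval for Group E is about ten times wider than the one for Group A, since the half-width scales with 1/sqrt(n) and sqrt(10,000/100) = 10. That imprecision is what I am worried about.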
