Suppose we have a 3x2 contingency table (test of independence). If one of the rows has only zero observations, can we omit that row and proceed with the analysis as a 2x2 contingency table? If not, what can we do?
A row of zeros makes the expected values for that row zero. This means we will have expected counts of less than five, which is not acceptable. The zero-row cells may, however, be combined with nonzero cells so that the chi-square test for independence can be used.
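To make the point about zero expected values concrete, here is a minimal sketch in Python. The counts are invented purely for illustration, and the use of NumPy and SciPy is an assumption (nobody in this thread named a tool). It computes the expected counts by hand, shows that the empty row gets expected counts of zero, and then runs the test on the table with the empty row removed.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 3x2 table (made-up counts); the second row has no observations.
observed = np.array([[12, 8],
                     [0, 0],
                     [5, 15]])

# Expected count for cell (i, j) = row_total[i] * col_total[j] / grand_total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()
print(expected)  # the empty row has expected counts of zero

# With E = 0 the term (O - E)^2 / E is 0/0, so the statistic is undefined
# for the full table. Keep only the rows that contain at least one case.
reduced = observed[observed.sum(axis=1) > 0]

# Pearson chi-square on the reduced table (no continuity correction here).
chi2, p, dof, expected_reduced = chi2_contingency(reduced, correction=False)
print(reduced.shape, dof, round(chi2, 3), round(p, 4))
```

After the empty row is removed the table is 2x2, so the test has (2 - 1)(2 - 1) = 1 degree of freedom rather than the 2 that a full 3x2 table would have.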
If, after dropping categories that have no cases (as correctly suggested by Ette Etuk), a contingency table still has too few cases (e.g., expected cell frequencies fall below the usual recommended minimum of 5), then you may:
1. collect more data; or
2. try a Fisher exact test.
Of course, if small sample size is really the culprit behind the empty cells and small expected cell frequencies, please recognize that the statistical power of the chi-square test (or any other comparable test, for that matter) will not be very high.
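As a rough illustration of the Fisher exact test option, the sketch below uses SciPy's `fisher_exact` on a small 2x2 table. The counts are invented, and SciPy is an assumed tool rather than anything specified in this thread.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2x2 table with very few cases (made-up counts).
table = np.array([[3, 1],
                  [1, 4]])

# Check the usual rule of thumb: are any expected counts below 5?
expected = chi2_contingency(table, correction=False)[3]
has_small_expected = bool((expected < 5).any())
print(expected, has_small_expected)

# Fisher's exact test does not rely on the large-sample chi-square approximation.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(odds_ratio, round(p_value, 4))
```

With so few cases the p-value will be large unless the association is very strong, which is the low-power point made above.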
Assuming you are not just trying to find a p-value, but are actually interested in estimating an effect, including effects involving that row or column, then one approach is to add a starting value to each cell (think of this as a prior, or as something like Yates' correction). This of course assumes that these are sampling zeroes. If you just delete the row or column and then do things like use the chi-square statistic to calculate the phi effect size, then this will be the effect size for the 2x2 table, which is perhaps not what you are interested in. Of course the choice of the starting value/prior is tough, but that's stats. However, if all you are interested in is the p-value, then yes, just delete.
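A small sketch of the difference between the two routes, under some loudly stated assumptions: the counts are invented, the starting value of 0.5 is an arbitrary choice for illustration, and Cramér's V is used as the effect size because it reduces to the absolute value of phi for a 2x2 table. None of these specifics come from the thread.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramér's V from the uncorrected chi-square; equals |phi| for a 2x2 table."""
    table = np.asarray(table, dtype=float)
    chi2 = chi2_contingency(table, correction=False)[0]
    n = table.sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

# Hypothetical 3x2 table (made-up counts) with an empty middle row.
observed = np.array([[12, 8],
                     [0, 0],
                     [5, 15]])

# Route 1: add a starting value to every cell and keep all three rows.
start = 0.5  # assumed value; the choice is a judgment call
smoothed = observed + start
v_smoothed = cramers_v(smoothed)

# Route 2: delete the empty row and compute phi for the remaining 2x2 table.
reduced = observed[observed.sum(axis=1) > 0]
phi_reduced = cramers_v(reduced)

print(round(v_smoothed, 3), round(phi_reduced, 3))
```

The two numbers answer different questions: the first describes the 3x2 table with the sparse row kept in, the second describes only the 2x2 table that is left after deletion.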
On collecting more data after you have started the analysis (David Morse's suggestion), be careful if you are planning on doing a chi-square test, unless you have pre-registered that stopping rule.
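To see why the stopping rule matters, here is a small simulation sketch. Everything in it is an assumption made for illustration (the sample sizes at which the data are checked, the number of simulated studies, the seed, and the use of NumPy and SciPy). Two unrelated binary variables are generated, so any significant result is a false positive. One analyst tests once at the final sample size; the other tests at every interim look and stops at the first p < .05.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

def p_value(x, y):
    """Chi-square p-value for the 2x2 table of two binary variables."""
    table = np.zeros((2, 2))
    np.add.at(table, (x, y), 1)  # count each (x, y) pair
    # If a whole row or column is empty the test is undefined; treat as non-significant.
    if (table.sum(axis=0) == 0).any() or (table.sum(axis=1) == 0).any():
        return 1.0
    return chi2_contingency(table, correction=False)[1]

n_sims, alpha = 1000, 0.05
looks = [20, 40, 60, 80, 100]  # hypothetical sample sizes at which the data are checked

fixed_hits = 0
peeking_hits = 0
for _ in range(n_sims):
    # Independent variables: the null hypothesis is true by construction.
    x = rng.integers(0, 2, size=looks[-1])
    y = rng.integers(0, 2, size=looks[-1])
    fixed_hits += int(p_value(x, y) < alpha)
    peeking_hits += int(any(p_value(x[:n], y[:n]) < alpha for n in looks))

fixed_rate = fixed_hits / n_sims
peeking_rate = peeking_hits / n_sims
print(fixed_rate, peeking_rate)
```

Because the final look is one of the interim looks, the peeking false-positive rate can never be lower than the fixed-sample rate, and in runs like this it is typically noticeably higher than the nominal level.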
You might describe a little bit more about your data and research questions.
Thank you Ette, David and Daniel for the suggestions.
What I came to understand is this: if all the observed values in a row or column are zero, and we are looking for an association between attributes (say, for a p-value), can we then omit the zero rows or columns and analyse the data with fewer degrees of freedom than the original table?