Alternatives of Fisher's exact test for more than 2 groups?

01 January 2014

I am doing a chi square test on a 3X3 contingency table. However, there are some cells with expected value

Patrick A Green

You can do it in SPSS: go to Crosstabs -> Exact, then tick the Exact box, and you get the Fisher's exact result in the stats box.

Richard David Gill

There is a "Fisher exact test" for a general r x c contingency table. Compare the observed value of the usual chi-square statistic not to an asymptotic chi-square distribution but to the exact permutation distribution of the chi-square statistic. That means: you allocate your N observations to the r.c cells in all possible ways such that the row and column margins are constant and equal to those actually observed. However, usually there are far too many permutations to enumerate explicitly. Instead we take a large random sample of permutations. For instance, this can be done using the R package "coin". See http://www.statmethods.net/stats/resampling.html, in particular the section

Independence in Contingency Tables

# Independence in 2-way Contingency Table based on
# 9999 Monte-Carlo resamplings. A and B are factors.
library(coin)
chisq_test(A~B, data=mydata,
           distribution=approximate(B=9999))
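
For readers who do not use R, the same idea (refer the observed chi-square statistic to a random sample of tables with the observed margins) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the coin implementation: the function names, the seed and the example counts are all made up here, and the resampling is done by shuffling the column labels of the individual observations.

```python
import random

def chi_square_stat(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                stat += (obs - expected) ** 2 / expected
    return stat

def monte_carlo_chi_square(table, n_resamples=9999, seed=0):
    """Monte Carlo p-value for independence, conditioning on both margins."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per observation.
    row_labels, col_labels = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_labels += [i] * count
            col_labels += [j] * count
    observed = chi_square_stat(table)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(col_labels)  # shuffling one factor keeps both margins fixed
        resampled = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            resampled[i][j] += 1
        # Small tolerance so ties with the observed statistic count as hits.
        if chi_square_stat(resampled) >= observed - 1e-9:
            hits += 1
    # Count the observed table as one of the resamples so p is never 0.
    return observed, (hits + 1) / (n_resamples + 1)

example = [[5, 1, 0], [2, 6, 1], [0, 2, 7]]  # made-up counts, for illustration only
stat, p_value = monte_carlo_chi_square(example, n_resamples=1999)
print(round(stat, 3), p_value)
```

The p-value varies slightly with the seed and the number of resamples; increasing n_resamples reduces that Monte Carlo error.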

Richard David Gill

You can find several more ways to do this in R by following this thread in the R help archives:

https://stat.ethz.ch/pipermail/r-help/2008-September/174032.html

Topic: [R] Exact test in nxm contingency table

I am trying to find a permutation test that works on a general nxm table. The data set is small, so some cells have counts too small for the chi2 approximation to be valid. If the table were a 2x2 contingency table I would use a Fisher exact test (fisher.test), but that won't work on this general table. Does a general function for this test exist?

Here are two answers:

Søren Højsgaard wrote:

Using r2dtable() you can simulate general nxm tables with given margins. Based on these you can calculate a Monte Carlo p-value for a conditional test for independence.

Peter Dalgaard wrote:

fisher.test(...., simulate.p.value=TRUE) might be more direct. Also works for chisq.test().

And, contrary to popular belief, fisher.test() does work for larger than 2x2 tables, although you may run into space/time limitations.

Loretta J Stalans

Optimal data analysis is ideal for small samples, and provides alternatives to chi-square. See the APA book with software: Yarnold and Soltysik (2005). Optimal Data Analysis. American Psychological Association. Dr. Paul Yarnold also has a journal and blog at http://odajournal.com/.

Ronán Michael Conroy

A 3x3 table often tests a rather vague hypothesis - "there is some kind of relationship between…". Are either of your variables ordered? If so, then the analysis you propose does not take this into account. And if both variables are categorical, do you have a more precise hypothesis?

For example, if one variable is smoking, coded as -Never, -Ex and -Current, then you can think of this as two binary variables: whether the person ever smoked and whether they smoke now. Likewise, marital status can often be rewritten as two binary variables: whether the person ever married, and whether they are still married.

It's useful to think about variables with more than two categories to see if you can find a simpler underlying structure of binary variables that will allow you to specify and test more precise hypotheses.
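
The recoding described above can be sketched as follows. This is only an illustration: the category names follow the smoking example, while the indicator names, the binary outcome and the data are hypothetical.

```python
from collections import Counter

# Hypothetical recoding of a three-level smoking variable into two binary indicators.
RECODE = {
    "Never":   {"ever_smoked": 0, "smokes_now": 0},
    "Ex":      {"ever_smoked": 1, "smokes_now": 0},
    "Current": {"ever_smoked": 1, "smokes_now": 1},
}

def binary_table(statuses, outcomes, indicator):
    """2x2 table of counts: rows = indicator 0/1, columns = outcome 0/1."""
    counts = Counter((RECODE[s][indicator], o) for s, o in zip(statuses, outcomes))
    return [[counts[(r, c)] for c in (0, 1)] for r in (0, 1)]

statuses = ["Never", "Ex", "Current", "Current", "Never", "Ex"]  # made-up data
outcomes = [0, 0, 1, 1, 0, 1]                                    # made-up binary outcome

ever = binary_table(statuses, outcomes, "ever_smoked")
now = binary_table(statuses, outcomes, "smokes_now")
print(ever)
print(now)
```

Each resulting 2x2 table addresses one specific question (ever smoked vs. outcome, currently smoking vs. outcome) instead of a single omnibus 3-category comparison.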

Peter Moono

Thanks Richard. Sounds like the right way to go

D. I. Matthews

Yes, use exact or Monte Carlo methods as shown in other posts. These methods overcome any problems with the assumptions of the chi-squared independence test and give accurate estimates of significance.

If you use SPSS and have the Exact Tests module installed, you can easily compute the answer (or use StatXact).

The simpler chi-squared test may be OK in most cases!

Note that even though you have some cells with expected values < 5, it is still likely that your result using the asymptotic p-value of the chi-squared test will be OK. It depends on how many cells have expected values below 2 or so.

The exact test is the best, and it is really worth using if the marginal counts are small.

Richard David Gill

@Ronan: good point. If the amount of data is rather small there is little point in performing an omnibus test for independence. It has a tiny amount of power against every conceivable alternative. Probably some alternatives are more plausible/interesting/important than others. So indeed one should think of replacing the chi-squared statistic with something focussed on particular kinds of alternatives. This still leaves you free to use the permutation approach to determine significance.

David R Bristol

Fisher's exact test is not "exact" in the sense of a permutation test, or enumeration. In a sense, it is a misnomer. For comparison of the proportions of success in two groups, there are two unknown parameters, namely the two success probabilities. This can be re-parametrized into the difference and one success probability, which is a nuisance parameter. Conditioning on the estimate of this nuisance parameter (and essentially fixing the marginals) results in Fisher's exact test, based on the hypergeometric distribution. For r by c, see Mehta and Patel (JASA, 1983) and www.cytel.com.

Richard David Gill

So Fisher's exact test is an exact test in the same sense that a permutation test is exact. Moreover it's the permutation test based on all permutations leaving the margins fixed. Under the null hypothesis and conditional on the margins, the distribution of the data is uniform over all feasible permutations. Let's call it an exact conditional permutation test.
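
For reference, the conditional null distribution described above can be written explicitly. The notation is an assumption made here, not taken from the thread: n_{ij} are the cell counts of an r x c table, n_{i+} and n_{+j} the row and column totals, and N the grand total. A uniform distribution over permutations of the individual observations induces the following probability for each table with the observed margins:

```latex
P\bigl(\{n_{ij}\} \mid \text{margins}\bigr)
  = \frac{\prod_{i=1}^{r} n_{i+}! \; \prod_{j=1}^{c} n_{+j}!}
         {N! \; \prod_{i=1}^{r} \prod_{j=1}^{c} n_{ij}!}
```

For a 2x2 table this reduces to the hypergeometric distribution mentioned by David Bristol.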

Richard David Gill

@Brian Altonen: you are saying that by multiplying the observed cell counts by 2, 3, 4, ... the chi-square p-value will decrease then plateau. However, multiplying all cell counts by N multiplies the statistic by N and takes you further and further into the tail of a fixed chi-square distribution as N increases without limit, if you are referring your statistic to its large-sample approximate null distribution. So the phenomenon which you observe must be a small-sample phenomenon due to the discreteness and conservatism of the Fisher exact test. And whether the p-values you see when you have artificially inflated the sample size by some arbitrary factor N=2 or 3 or 4 have any statistical meaning at all is not clear to me.
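
The scaling argument can be checked numerically. The sketch below uses made-up counts for a 3x3 table, and the closed-form tail probability in it is valid only for exactly 4 degrees of freedom (the 3x3 case); both are assumptions for illustration.

```python
from math import exp

def chi_square_stat(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return sum(
        (obs - row_tot[i] * col_tot[j] / n) ** 2 / (row_tot[i] * col_tot[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )

def chi_square_sf_df4(x):
    """Upper tail of the chi-square distribution with exactly 4 degrees of freedom."""
    return exp(-x / 2) * (1 + x / 2)

base = [[3, 1, 2], [1, 4, 1], [2, 1, 5]]  # made-up counts, for illustration only
results = {}
for k in (1, 2, 3, 4):
    scaled = [[k * count for count in row] for row in base]
    stat = chi_square_stat(scaled)
    results[k] = (stat, chi_square_sf_df4(stat))
    print(k, round(stat, 3), results[k][1])
```

The statistic grows in proportion to the multiplier, so the asymptotic p-value keeps shrinking rather than levelling off.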

Richard David Gill

Everyone is familiar with the usual rule of thumb "asymptotic chi-square approximation is adequate if expected number of observations per cell is at least 5". I recall work by Albert Verbeek which showed that one can be much more liberal. As long as the number of cells with small expected cell count is small compared to the rest, or something like that, things are not so bad. Unfortunately it seems this work never got published. I need to ask his former collaborators what came of this work.

Wing Fai Yeung

Thanks so much for all your detailed explanations. Actually, I used the 3X3 table to test between-group differences at baseline in an RCT (3 groups, 3 categories) using SPSS. Prof. Gill mentioned that Fisher's exact test works in tables larger than 2X2, but SPSS just can't do that. Is there any SPSS package for doing that? Or can I just go with chi-square? Thanks

Rocio Hassan

Hi, I've enjoyed very much the theoretical considerations!

In the Vassar University stats online resource there is something you can use to explore your associations:

http://vassarstats.net/

Go to Procedures Applicable to Categorical Frequency Data / Fisher Exact Probability Test for Tables Larger than 2x2 (2x3, 2x4, 3x3).

Richard David Gill

@Wing Fai Yeung: do it with R. Learn R! Small investment now, big payoff later.

Richard David Gill

@Rocio Hassan: thanks for the link, interesting site, interesting methods. I saw the rule of thumb I was looking for: "at least 80% of the cells have an expected frequency of 5 or greater, and that no cell has an expected frequency smaller than 1.0".

The usual rule of thumb "all expected frequencies at least 5" is unnecessarily strict.
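
The quoted rule is easy to check for any table. A minimal sketch, with hypothetical function names and made-up counts; the default thresholds are the ones in the rule quoted above.

```python
def expected_counts(table):
    """Expected cell counts under independence, from the row and column totals."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def rule_of_thumb_ok(table, min_share=0.8, threshold=5.0, floor=1.0):
    """True if at least 80% of expected counts are >= 5 and none is below 1."""
    expected = [e for row in expected_counts(table) for e in row]
    share_large = sum(e >= threshold for e in expected) / len(expected)
    return share_large >= min_share and min(expected) >= floor

sparse = [[5, 1, 0], [2, 6, 1], [0, 2, 7]]        # made-up counts, N = 24
dense = [[20, 15, 25], [18, 22, 20], [25, 20, 15]]  # made-up counts, N = 180

print(rule_of_thumb_ok(sparse))
print(rule_of_thumb_ok(dense))
```

Note that the rule is about expected counts, not observed counts, so a table with observed zeros can still pass it.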

Wing Fai Yeung

@Rocio Hassan, thanks for the link, it is very helpful

@Richard Gill, it seems that SPSS alone is not enough for my statistical analysis. I think I should explore R. Thank you for your suggestion~!

Pramesh Koju

Sorry, I am clear on some of these things, but can I use the chi-square test for an association between generic drugs (3 different generics) and promotional activities (10 different promotions)? The output shows more than 70% of cells with less than 5. How can I solve this, and how do I interpret it?

Peter Moono

Hi Wing, you might find this paper useful too: "Using Lancaster's Mid-P correction to the Fisher's exact test for adverse impact analyses", Dan A. Biddle, J. Appl. Psychol. 2011, Vol. 96, No. 5, 956-965.

Jess Thomson

Patrick A Green, thank you so much! :D

Saruna Ghimire

The Fisher-Freeman-Halton test, an extension of Fisher's exact test, can be applied to contingency tables that are not 2x2. This link provides a way to do it in SPSS.

http://www-01.ibm.com/support/docview.wss?uid=swg21479647

Nahid Moradi

This is exactly the question I have. I've been searching the internet extensively and what I've found is somewhat conflicting. Some suggest the original Fisher's exact test (which is the one SPSS calculates through Crosstabs) can be extended to r x c tables, and that the reason it was not originally included by Fisher himself is that the computation becomes difficult to near impossible by hand, though computers are capable of doing it.

Others suggest some kind of extension, like what has been mentioned above. I cannot decide which one is correct, or at least better to adhere to. Some extensions, like Freeman-Halton, are definitely alien to many researchers and readers of medical journals at least.

Carlos Quiroz Dahik

Thank you Rocio Hassan your link helped me a lot.

Tomio Andoh

I have a question. How do I calculate the sample size needed for a 2x3 Fisher exact test?

Is there any free software to calculate it?

Dr. Senthilvel Vasudevan

Dear Tomio Andoh,

Good Morning,

Your question is very interesting, but it is also a wrong one. If your (2 x 3) table has > 5 in all the cells, then you have to find the association between the variables by using the chi-square test only.

If your (2 x 3) table has fewer than 5 in any one cell, then you have to find the association between the variables by using Fisher's exact test.

There is no formula for finding the sample size needed for a 2 x 3 Fisher's exact test.

Refer the following links:

1. http://www.biostathandbook.com/chiind.html

2. http://www.biostathandbook.com/fishers.html

Peter Moono

Not sure how you want to report your data. You may have a look at this website if you are comparing means.

https://brownmath.com/stat/anova1.htm

Jaume Sastre Tomas

I have a small dataset (N=400 samples). I want to apply Fisher's exact test because my contingency table has a lot of 0s, but I have read that Fisher's exact test is only for N≤90. So I returned to the chi-square test, but I read that the chi-square test should be performed only if at least 80% of the cells have an expected frequency of 5 or greater and no cell has an expected frequency smaller than 1.0, which is not my case.

Which test do I have to use?

Any response might be fine!

Thanks in advance.

Nahid Moradi

Dear Jaume, I believe the best course of action is collapsing your table into fewer rows and columns. In many medical scenarios at least, collapsing the table is sensible and logical.

Hesam Sharifi

Merge some rows or columns to achieve a frequency of 1 or more for the desired variables.

If you don't want to do that, you may use the alternative to Fisher's exact test called the Fisher-Freeman-Halton test.

You can search the internet for this test.

Jaume Sastre Tomas

Thanks Hassan, do you know if there is an implementation of the Fisher-Freeman-Halton test?

Pieter Marinus Kroonenberg

Please have a look at my recent paper on this issue. It provides a numerical answer to the question posed here. Article: The Tale of Cochran's Rule: My Contingency Table has so Many Expected Values Smaller than 5, What Am I to Do? https://amstat.tandfonline.com/doi/abs/10.1080/00031305.2017.1286260

Abstract: In an informal way, some dilemmas in connection with hypothesis testing in contingency tables are discussed. The body of the paper concerns the numerical evaluation of Cochran's Rule about the minimum expected value in r×c contingency tables with fixed margins when testing independence with Pearson's X² statistic using the chi-squared distribution.

Bince Varghese

Can you please explain the difference between Pearson chi-square, continuity correction and Fisher exact? In this case, which test can I use?

Jithin Thomas Parel

Dear Mr. Bince

Thank you for your excellent question.

Both Pearson's chi-square test and Fisher's exact test are non-parametric tests that look for an association between dichotomous categorical variables. However, the general statistical rule of thumb for the chi-square test is at least 5 observations in each cell of a 2x2 contingency table.

But if one of the observations in the 2x2 contingency table is less than 5, then you must go for Fisher's exact test.

As can be seen from the image you attached, it clearly states that 2 cells have an expected count less than 5. In that case you must go for Fisher's exact test.

Bince Varghese

Thq Jithin sir

Gabriel Ling Hoh Teck

Monte Carlo will do...

Anas Musah

Hello everyone, I am having a similar issue. Please can I get example code in Python, or a package, to compute Fisher's exact test for a contingency table larger than 2x2? Attached is my table!
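
A minimal pure-Python sketch of an exact test for a small r x c table, by full enumeration of all tables with the observed margins. Several things here are assumptions: the function names are made up, the counts are hypothetical (the attached table is not reproduced in this thread), and the p-value is defined as the total probability of all tables no more probable than the observed one, which is one common convention for the Fisher-Freeman-Halton test. Enumeration is only feasible for small tables and small N; otherwise use Monte Carlo resampling.

```python
from math import exp, lgamma

def log_factorial(n):
    return lgamma(n + 1)

def table_log_prob(table, log_const):
    """Log-probability of a table given its margins (multivariate hypergeometric)."""
    return log_const - sum(log_factorial(x) for row in table for x in row)

def compositions(total, caps):
    """Yield every way to split `total` into len(caps) counts with counts[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest

def tables_with_margins(row_tot, col_tot):
    """Yield every non-negative integer table with the given row and column totals."""
    if len(row_tot) == 1:
        yield (tuple(col_tot),)  # the last row is forced by the remaining column totals
        return
    for first_row in compositions(row_tot[0], col_tot):
        remaining = [c - x for c, x in zip(col_tot, first_row)]
        for rest in tables_with_margins(row_tot[1:], remaining):
            yield (first_row,) + rest

def exact_rxc_test(table):
    """Exact conditional p-value: sum over tables no more probable than the observed one."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    log_const = (sum(log_factorial(r) for r in row_tot)
                 + sum(log_factorial(c) for c in col_tot)
                 - log_factorial(n))
    log_p_obs = table_log_prob(table, log_const)
    p_value = 0.0
    for candidate in tables_with_margins(row_tot, col_tot):
        log_p = table_log_prob(candidate, log_const)
        if log_p <= log_p_obs + 1e-7:  # tolerance so ties with the observed table count
            p_value += exp(log_p)
    return min(p_value, 1.0)

example = [[5, 1, 0], [2, 6, 1], [0, 2, 7]]  # hypothetical counts, not the attached table
print(exact_rxc_test(example))
```

For a 2x2 table this gives the usual two-sided Fisher exact p-value under the same "no more probable than observed" convention.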

Pieter Marinus Kroonenberg

asymptotic p = .0019
exact p = .0017

residuals:
 5.0   0.5  -5.5
-5.0   7.0  -2.0
 0.0  -7.5   7.5

Patrick Morcillo

A 3X3 Fisher exact test calculator

https://www.danielsoper.com/statcalc/calculator.aspx?id=59

Mohammed Jasim Mohammed

A Fisher's exact test for more than 2*2 table.

https://www.jstor.org/stable/2288652?origin=crossref&seq=1#page_scan_tab_contents .
