I have 5 groups of rats (each group n = 6): Group 1, healthy control rats; Group 2, diseased control rats; Group 3, Drug A-treated rats; Group 4, Drug B-treated rats; Group 5, Drug C-treated rats. I have to compare anxiety behavior between the groups for each treatment. Which statistical test would be appropriate for this analysis?
One-way ANOVA followed by Tukey's post-hoc test might be suitable for your study.
You have five small (n = 6) independent groups of rats. In that case a non-parametric test such as the Kruskal-Wallis one-way analysis of variance (H statistic) is suitable. See Sidney Siegel: Nonparametric Statistics for the Behavioral Sciences.
If your groups are matched on some variables, so that you can consider them dependent or related, then you can use the Friedman two-way analysis of variance. See Siegel, pp. 166-172.
If you then want to compare two related groups of rats with n = 6, you can use Wilcoxon's matched-pairs signed-ranks test; see ibid., pp. 75-83.
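In base R, a minimal sketch of the three tests suggested above (all object and column names are hypothetical):

# Kruskal-Wallis for 5 independent groups; `d` has one anxiety score per rat
kruskal.test(anxiety ~ group, data = d)
# Friedman for related groups; `m` is a numeric matrix with rows = matched
# sets of rats and columns = the 5 conditions
friedman.test(m)
# Wilcoxon matched-pairs signed-ranks test for two related samples x and y
wilcox.test(x, y, paired = TRUE)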
I do not know which software you use. In this situation I use GraphPad Prism and run a one-way ANOVA with the Newman-Keuls multiple comparisons test.
Nota bene: you do not need software. Take the book, paper and pencil in hand; these tests are easy to calculate by hand.
If you find my suggested book difficult, then try this: Judith Green and Manuela D'Oliveira: Learning to Use Statistical Tests in Psychology. Three or more conditions:
I guess that your rats are matched on age, strain and sex, and then I think you should use
Friedman's analysis of variance for related groups, pp. 44-45, 55-59. As an extension of Friedman's test you can then use Page's L trend test. In this book you are guided step by step through calculating these small groups by hand. Nevertheless, I do not think you need to look for a trend across your anxiety conditions.
Whether your variables are continuous or not does not matter, because the values are transformed to ranks in these methods. I really recommend this book (Judith Green and Manuela D'Oliveira: Learning to Use Statistical Tests in Psychology); with small rat and mouse groups you save a lot of time calculating by hand as opposed to computing. Good luck!
Well, this is not what I was taught in our laboratory. Paired tests are used for matched groups too, especially if the rats are bred that way.
It depends on your response variable: whether it is quantitative or categorical. If it is quantitative, check the assumptions of ANOVA models before running a test, i.e. homogeneity of variances and normality of residuals. If it meets the assumptions you can run an ANOVA and, if necessary, post-hoc tests. Also, if you have a priori hypotheses about the effects of the drugs, you can consider planned contrasts; these are more powerful than post-hoc tests. On the other hand, if your response variable doesn't meet the ANOVA assumptions, consider running a GLM.
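For instance, a minimal sketch of these checks and follow-ups in base R (data frame `d` with columns `anxiety` and `group`; all names hypothetical):

fit <- aov(anxiety ~ group, data = d)       # one-way ANOVA
shapiro.test(residuals(fit))                # normality of residuals
bartlett.test(anxiety ~ group, data = d)    # homogeneity of variances
summary(fit)                                # omnibus F test
TukeyHSD(fit)                               # post-hoc pairwise comparisons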
good luck!
Parametric tests demand group sizes of a minimum of n = 30! I disagree with Simon B. Castillo.
And I do think that you CAN benefit from the extension of the Friedman test in the form of Page's L trend test. What response in what apparatus serves as the dependent variable for these groups? How do they perform under drug and non-drug conditions? Group 1 (healthy control rats), Group 2 (diseased control rats), Group 3 (Drug A), Group 4 (Drug B) and Group 5 (Drug C) may form a trend.
Other post-hoc analyses for the Friedman test: post-hoc tests were proposed by Schaich and Hamerle (1984) as well as Conover (1971, 1980) in order to decide which groups are significantly different from each other, based upon the mean rank differences of the groups. These procedures are detailed in Bortz, Lienert and Boehnke (2000, p. 275).
The Page test is useful where:
- there are three or more conditions,
- a number of subjects (or other randomly sampled entities) are all observed in each of them, and
- we predict that the observations will have a particular order.
For example, a number of subjects might each be given three trials at the same task, and we predict that performance will improve from trial to trial. A test of the significance of the trend between conditions in this situation was developed by Page (1963). More formally, the test considers the null hypothesis that, for n conditions, where mi is a measure of the central tendency of the ith condition,
m1 = m2 = ... = mn
against the alternative hypothesis that
m1 <= m2 <= ... <= mn, with at least one of the inequalities strict.
It has more statistical power than the Friedman test when the alternative of an ordered trend holds. Friedman's test considers the alternative hypothesis that the central tendencies of the observations under the n conditions differ, without specifying their order.
Page, E. B. (1963). "Ordered hypotheses for multiple treatments: A significance test for linear ranks". Journal of the American Statistical Association 58 (301): 216–30. doi:10.2307/2282965. JSTOR 2282965.
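For illustration, the Friedman test in base R on matched data (rows = matched sets of rats, columns = the five conditions; the numbers below are simulated placeholders, not real data):

set.seed(1)
m <- matrix(rnorm(30), nrow = 6, ncol = 5)  # 6 matched sets x 5 conditions
friedman.test(m)
# Page's L trend test is not in base R; implementations exist in add-on
# packages (e.g., PageTest() in the DescTools package)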
If your groups were not matched (paired), you could not draw any conclusions about treatment at all. It follows that your groups must be analysed as paired groups (rats matched on age, strain, housing conditions and so forth). Matching is a statistical technique used to evaluate the effect of a treatment by comparing the treated and the non-treated units in an observational study or quasi-experiment (i.e. when the treatment is not randomly assigned). The goal of matching is, for every treated unit, to find one (or more) non-treated unit(s) with similar observable characteristics against which the effect of the treatment can be assessed. By matching treated units to similar non-treated units, matching enables a comparison of outcomes among treated and non-treated units to estimate the effect of the treatment with reduced bias due to confounding.
I respectfully disagree with Béatrice Ewalds-Kvist: you don't need 30 subjects or more to perform parametric statistical analysis, as long as the empirical distribution complies with the distributional assumptions of those tests.
Rank-based tests assume that the tested distributions are identically shaped (i.e. the pure-shift assumption). If this assumption is not met, their results must be interpreted with extreme caution, because the tests are excessively sensitive to skewness and heteroskedasticity. A difference in spread or skewness can produce an apparent difference in position, so the null hypothesis of equal medians is rejected in error; conversely, a real difference in position can be masked by a difference in shape.
Another problem with rank-based tests is that their robustness decreases as sample size increases (though this is less important in your case, as you will work with a small number of animals per treatment group).
Fagerland MW, Sandvik L (2009). The Wilcoxon-Mann-Whitney test under scrutiny. Stat Med. 28(10):1487-97. doi: 10.1002/sim.3561.
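A small simulation illustrates this sensitivity (distributions chosen purely for illustration): two samples with the same median (zero) but different shapes make the Wilcoxon-Mann-Whitney test reject well above the nominal 5% rate:

set.seed(1)
pvals <- replicate(5000, {
  x <- rexp(50) - log(2)   # skewed distribution, median exactly 0
  y <- rnorm(50)           # symmetric distribution, median exactly 0
  wilcox.test(x, y)$p.value
})
mean(pvals < 0.05)  # noticeably above 0.05 despite equal medians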
You are right, you do not always need n = 30 to use a parametric test, but this is what we, as a general rule, teach in Finland in order to satisfy the assumption of normal distribution and not confuse our students with too complicated explanations. Of course you can test for normality with the Kolmogorov-Smirnov test, and then you know whether your sample complies with a normal distribution or not.
Béatrice, with all due respect, I believe that teaching such "general rules" prevents students from learning how to constantly improve their statistical skills in order to keep pace with their growing methodological skills. I am extremely worried about the increasing number of sophisticated biomedical studies that are tainted by poor statistical testing.
The problem with the Kolmogorov-Smirnov (K-S) test for a statistical distribution is that it has extremely low power, and simulations have shown that it is unreliable unless the sample size is huge (if I remember well, around 2000 subjects). Also, some authors point out that the use of these tests increases the risk of type-I error beyond the conventional 0.05 significance level. Worse, as a reviewer I often see manuscripts where the authors justify their use of ANOVA on discrete data because they performed a K-S test or a Shapiro-Wilk test that failed to reject the hypothesis of a normal distribution.
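For example, with samples as small as the ones discussed here, a normality test usually fails to flag even clearly non-normal data:

set.seed(1)
pvals <- replicate(10000, shapiro.test(rexp(6))$p.value)  # n = 6, exponential data
mean(pvals < 0.05)  # low rejection rate: the test has little power at n = 6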
Different countries have different "general rules", to my surprise. There are also differences between Sweden and Finland in the use of statistics. In Finland you do not use nonparametric statistics for big groups (n > 30); in Sweden such methods can be used for very big groups (more than n = 1000).
How strange indeed. I suggest you read the following paper:
Fagerland MW (2012). t-tests, non-parametric tests, and large studies--a paradox of statistical practice? BMC Med Res Methodol. 12:78.
PubMed PMID: 22697476; PubMed Central PMCID: PMC3445820.
It is food for thought...
Hi all,
I have two inputs to make to this thread.
Firstly, about the normality assumption and its test. It is fundamental to point out that it is *errors* (i.e., the difference between the actual score of a statistical individual and the "true" score for that statistical individual as predicted by the hypotheses at hand) that are supposed to be normally distributed *conditional* on the levels of the explanatory variable (aka the independent variable), or, more generally, conditional on the linear combination of the levels of the explanatory variables. I have found that this condition of the general linear model is often not understood correctly, so in case of doubt please read my contribution to this thread:
https://www.researchgate.net/post/How_can_I_check_the_closeness_of_normal_distribution
Secondly, I am going to suggest a far more powerful method, based on Bayesian model comparisons, that is very simple (if you understand what planned comparisons are) and does not have the drawbacks of the Fisher-Neyman approach. You can carry out the analysis I suggest very easily in R with a GUI (graphical user interface) called R2STATS, which comes as a regular R library. It is extremely powerful (it is based on the lme4 package) and it is a point-and-click interface (i.e., you won't have to write R code at any point). If you want to give it a try,
- first make sure you have installed the latest R version (if in doubt install it anew)
- copy-paste this command in R:
install.packages("R2STATS", dependencies = TRUE)
If the rats you've got are representative of rats in general, then you may suppose that their performances, in the absence of a treatment (as well as under a treatment), will differ from one rat to another (within a given condition), but that these different performances (for rats in the same condition) will vary according to a normal distribution. So you may introduce into your analyses a random-effect Rat factor. The fact that it is a random-effect factor means that it has a known distribution, namely a normal one. This complicates the model a little but makes it more powerful (statistically speaking, it increases power) by including in the model (thus taking out of the residuals) the within-group variation due to the fact that rats differ from one another.
Next step: the model comparison step(s). Start by setting up a 5-level factor (say, Group5) that records which group each rat belongs to. Your data file should look like this:
Group5 Rat AnxietyDV
healthyctrl rat1 15
healthyctrl rat2 14
healthyctrl rat3 10
healthyctrl rat4 13
healthyctrl rat5 13
healthyctrl rat6 16
diseasedctrl rat7 18
diseasedctrl rat8 15
diseasedctrl rat9 20
diseasedctrl rat10 18
diseasedctrl rat11 17
diseasedctrl rat12 19
drugA rat13 15
...
drugA rat18 15
drugB rat19 15
...
drugB rat24 15
drugC rat25 15
...
drugC rat30 15
Now some words on the logic of model comparison. The nice thing about R2STATS is that it yields the BIC (Bayesian information criterion) associated with each model, so if you test all possible models (which is easy to do given your design) you know which model is most probably the "good" one for your data (the lower the BIC, the better the model; so the "best" model, if and only if you test them all, is the one with the lowest BIC). The aim of the analyses I present below is thus to end up with the best model for your data, without using your theoretical hypotheses about the treatments at any point. It is, in my opinion, the fairest way to go.
Try a first mixed-effect model:
M1: AnxietyDV ~ (1|Rat) + Group5
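(For readers who prefer the command line, here is a rough equivalent sketch using the hypothetical data frame `d` laid out above. Note that with a single observation per rat the (1|Rat) intercept cannot be separated from the residual error, so a plain fixed-effects fit gives the same group comparison:

m1 <- lm(AnxietyDV ~ Group5, data = d)  # 5-group model
BIC(m1)                                 # Bayesian information criterion
)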
Under the "Results" tab in R2STATS you'll get some output, ignore it for now and go look, under the "Graphs" tab, at the 5 means. For the sake of the example, let's suppose the means for the conditions healthyctrl and diseasedctrl are very close, and apart from the means of the other conditions (I'll make no suppositions about the 3 other conditions' means now). This would make you want to try a model that supposes healthyctrl and diseasedctrl conditions do not differ (while continuing to suppose that the 3 other means differ both from healthyctrl&diseasedctrl and one from another).
To test this, build a new variable (say, Group4). To do that, under the "data" tab click on "recode & transform", choose Group5 from the drop-down list, then write in the editable blank space: healthyctrl, diseasedctrl=ctrl. Make sure you change the variable name in the "Store under" editable blank space from whatever it is to Group4 (!! failing to do so will overwrite, and thus lose, one of your variables !!), then click on "Execute".
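(The same recode can be done without the GUI, in plain R, continuing the hypothetical data frame `d`:

d$Group4 <- as.character(d$Group5)
d$Group4[d$Group4 %in% c("healthyctrl", "diseasedctrl")] <- "ctrl"
d$Group4 <- factor(d$Group4)
)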
This yields a new variable, so under the "data" tab your data now look like this:
Group5 Rat AnxietyDV Group4
healthyctrl rat1 15 ctrl
healthyctrl rat2 14 ctrl
healthyctrl rat3 10 ctrl
healthyctrl rat4 13 ctrl
healthyctrl rat5 13 ctrl
healthyctrl rat6 16 ctrl
diseasedctrl rat7 18 ctrl
diseasedctrl rat8 15 ctrl
diseasedctrl rat9 20 ctrl
diseasedctrl rat10 18 ctrl
diseasedctrl rat11 17 ctrl
diseasedctrl rat12 19 ctrl
drugA rat13 15 drugA
...
drugA rat18 15 drugA
drugB rat19 15 drugB
...
drugB rat24 15 drugB
drugC rat25 15 drugC
...
drugC rat30 15 drugC
Go back to the "model" tab and try a new model:
M2: AnxietyDV ~ (1|Rat) + Group4
Again, under the "Results" tab in R2STATS you'll get some output, ignore it for now and go to the "comparisons" tab, click on "select all" and then click on "compare". As a result, M1 and M2 are compared, and the result is displayed (the GUI goes itself to the results tab so you see the results).
Now, two things may happen: either M2 is a better model for your data than M1, or it is not. How do you know, and what does it mean? To know whether M2 is a better model than M1, simply compare the BIC values: the model with the lower BIC is the better one. Suppose the BIC value for M2 is lower than the BIC value for M1: this means M2 is the better model, and since the assumption M2 makes that M1 does not make is that the healthyctrl and diseasedctrl conditions do not differ, this means that indeed healthyctrl and diseasedctrl do not differ, and one could just as well speak of a single control condition (without differentiating between healthy and diseased rats). Now suppose the BIC value for M2 is *higher* than the BIC value for M1: this means M2 is *not* the better model (so M1 is the better of the two), and since the assumption M1 makes that M2 does not make is that the healthyctrl and diseasedctrl conditions *do differ*, this means that indeed they *do differ*.
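(At the command line, the same comparison is just, continuing the hypothetical `d` and the m1 sketch above:

m2 <- lm(AnxietyDV ~ Group4, data = d)
BIC(m1, m2)  # the row with the lower BIC is the preferred model
)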
Then you proceed along these lines, trying to group conditions together and seeing whether you get a better model by doing so than by supposing more distinct groups. At the end you'll have the best model, and you interpret it.
Here is an example (fictitious, since I have no idea what your actual data would yield):
M1: AnxietyDV ~ (1|Rat) + Group5
M2: AnxietyDV ~ (1|Rat) + Group4
let's suppose BIC value of M2 is lower than BIC value of M1
this temporarily makes M2 the better model
this model says that there is no difference between healthy and diseased controls
now you look at the graph of M2, and it seems that with drug C anxiety is lower than with drugs A or B (intermediate anxiety), and that there is not much of a difference in anxiety between the drugA and drugB conditions; as for the ctrl condition, it is associated with still higher anxiety than the drugA and drugB conditions
you may want to consider a model that puts conditions drugA and drugB together (into drugAB), lets drugC be different from drugAB, and both drugAB and drugC different from ctrl
you build a new variable (say, Group3) starting from Group4. To do that, under the "data" tab click on "recode & transform", choose Group4 from the drop-down list, then write in the editable blank space: drugA, drugB=drugAB. Make sure you change the variable name in the "Store under" editable blank space from whatever it is to Group3 (!! failing to do so will overwrite, and thus lose, one of your variables !!), then click on "Execute".
This yields a new variable, so under the "data" tab your data now look like this:
Group5 Rat AnxietyDV Group4 Group3
healthyctrl rat1 15 ctrl ctrl
healthyctrl rat2 14 ctrl ctrl
healthyctrl rat3 10 ctrl ctrl
healthyctrl rat4 13 ctrl ctrl
healthyctrl rat5 13 ctrl ctrl
healthyctrl rat6 16 ctrl ctrl
diseasedctrl rat7 18 ctrl ctrl
diseasedctrl rat8 15 ctrl ctrl
diseasedctrl rat9 20 ctrl ctrl
diseasedctrl rat10 18 ctrl ctrl
diseasedctrl rat11 17 ctrl ctrl
diseasedctrl rat12 19 ctrl ctrl
drugA rat13 15 drugA drugAB
...
drugA rat18 15 drugA drugAB
drugB rat19 15 drugB drugAB
...
drugB rat24 15 drugB drugAB
drugC rat25 15 drugC drugC
...
drugC rat30 15 drugC drugC
under the "model" tab write the new model:
M3: AnxietyDV ~ (1|Rat) + Group3
you then compare M3 and your best model so far, M2
let's suppose BIC value of M3 is lower than BIC value of M2
this temporarily makes M3 the better model
this model tells you that drugA and drugB are not different as to their effect on anxiety in rats
under the "Graphs" tab, have a look at the 3 means of M3. Let's suppose the means for the conditions drugAB and ctrl are pretty close close, and apart from the means of the drugC conditions
let's try to group conditions drugAB and ctrl together and see if we get a better model
for that, you build a new variable (say, Group2) starting from Group3. To do that, under the "data" tab click on "recode & transform", choose Group3 from the drop-down list, then write in the editable blank space: drugAB, ctrl=drugABctrl. Make sure you change the variable name in the "Store under" editable blank space from whatever it is to Group2 (!! failing to do so will overwrite, and thus lose, one of your variables !!), then click on "Execute".
This yields a new variable, so under the "data" tab your data now look like this:
Group5 Rat AnxietyDV Group4 Group3 Group2
healthyctrl rat1 15 ctrl ctrl drugABctrl
healthyctrl rat2 14 ctrl ctrl drugABctrl
healthyctrl rat3 10 ctrl ctrl drugABctrl
healthyctrl rat4 13 ctrl ctrl drugABctrl
healthyctrl rat5 13 ctrl ctrl drugABctrl
healthyctrl rat6 16 ctrl ctrl drugABctrl
diseasedctrl rat7 18 ctrl ctrl drugABctrl
diseasedctrl rat8 15 ctrl ctrl drugABctrl
diseasedctrl rat9 20 ctrl ctrl drugABctrl
diseasedctrl rat10 18 ctrl ctrl drugABctrl
diseasedctrl rat11 17 ctrl ctrl drugABctrl
diseasedctrl rat12 19 ctrl ctrl drugABctrl
drugA rat13 15 drugA drugAB drugABctrl
...
drugA rat18 15 drugA drugAB drugABctrl
drugB rat19 15 drugB drugAB drugABctrl
...
drugB rat24 15 drugB drugAB drugABctrl
drugC rat25 15 drugC drugC drugC
...
drugC rat30 15 drugC drugC drugC
under the "model" tab write the new model:
M4: AnxietyDV ~ (1|Rat) + Group2
you then compare M4 and your best model so far, M3
let's suppose BIC value of M3 is lower than BIC value of M4
this means M4 is not better than M3, so M3 is the best model this far
in other words, drugAB (i.e., drugA or drugB, whose anxiety values do not differ; see M3 above) is associated with anxiety values that differ from those observed in the ctrl rats (i.e., healthyctrl or diseasedctrl, whose anxiety values do not differ; see M2 above)
there is a last question left now: is drugC indeed more effective than drugAB?
M3 supposes so, but we have not yet compared M3 to a model that supposes drugC is equally effective as drugAB
To do that, under the "data" tab click on "recode & transform", chose from the dropbox Group3 (the variable in M3), then write in the editable blank space: drugAB, drugC=drugABC. Make sure you change the variable name in the "Store under" editable blank space from whatever it is to GroupABCvsCtrl (!! failing to to so will result in overwriting thus losing one of your variables !!), then clic on "Execute".
This yields a new variable, so under the "data" tab your data now look like this (variable Group4 not shown):
Group5 Rat AnxietyDV Group3 Group2 GroupABCvsCtrl
healthyctrl rat1 15 ctrl drugABctrl ctrl
healthyctrl rat2 14 ctrl drugABctrl ctrl
healthyctrl rat3 10 ctrl drugABctrl ctrl
healthyctrl rat4 13 ctrl drugABctrl ctrl
healthyctrl rat5 13 ctrl drugABctrl ctrl
healthyctrl rat6 16 ctrl drugABctrl ctrl
diseasedctrl rat7 18 ctrl drugABctrl ctrl
diseasedctrl rat8 15 ctrl drugABctrl ctrl
diseasedctrl rat9 20 ctrl drugABctrl ctrl
diseasedctrl rat10 18 ctrl drugABctrl ctrl
diseasedctrl rat11 17 ctrl drugABctrl ctrl
diseasedctrl rat12 19 ctrl drugABctrl ctrl
drugA rat13 15 drugAB drugABctrl drugABC
...
drugA rat18 15 drugAB drugABctrl drugABC
drugB rat19 15 drugAB drugABctrl drugABC
...
drugB rat24 15 drugAB drugABctrl drugABC
drugC rat25 15 drugC drugC drugABC
...
drugC rat30 15 drugC drugC drugABC
under the "model" tab write the new model:
M5: AnxietyDV ~ (1|Rat) + GroupABCvsCtrl
you then compare M5 and your best model so far, M3
let's suppose BIC value of M3 is lower than BIC value of M5
this means M5 is not a better model than M3, i.e., it is not OK to suppose drugC is equally effective as drugAB
in the end we stick with M3
It is important to check under M3, in the results tab, that:
- the p-value associated with the normality test (Shapiro-Wilk's) is *higher* than 0.05
- the p-value associated with the homogeneity of the variances test (Levene's) is *higher* than 0.05
If either of these two conditions is not met, one should not trust the model (since the assumptions of the statistical method used would not have been met)
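(Outside the GUI, these two checks could look like this, continuing the hypothetical `d` and refitting the final model with lm() as in the earlier sketches; leveneTest() is in the add-on car package:

m3 <- lm(AnxietyDV ~ Group3, data = d)
shapiro.test(residuals(m3))                    # normality of residuals
car::leveneTest(AnxietyDV ~ Group3, data = d)  # homogeneity of variances
)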
If these checks are OK, then
M3 says (with respect to the DV anxiety):
- healthy controls do not differ from diseased controls (so let's call them ctrl)
- drug A do not differ from drug B (so let's call them drugAB)
- drugAB differs from no drug (i.e., ctrl)
- drug C differs from drugAB
To sum up, we have:
- no effect of the health status on anxiety (healthy controls do not differ from diseased controls)
- different effects of drugs: all drugs are effective (i.e., different from taking no drug), but drug C is more effective than drug A and drug B, the latter two being equally effective
Let me know if you have some questions/comments.
HTH,
cheers,
SCM
What do you suggest if the required conditions are not met?
Béatrice,
Before suggesting a solution, one needs to understand the problem (I guess one can trace this back to me being a psychologist, but it is still true when speaking of data analysis).
The "normality conditional to levels of the IV" condition may fail to be met because:
- an important IV was left out of the analysis. Solution: think again and include the IV you have left out
- the distribution, because of (for instance) few observations, does not look much like a normal distribution. Solution: data transformation (any monotonic function may be applied; which one to apply is more of an expertise matter...)
The "homogeneity of the variances" condition may fail to be met because:
- of bad luck ;) By this I mean too few observations per group causing, by mere chance, a much higher variance in one group than in the other(s). Solution: collect more data in each group.
- of a systematic reason; it is very important to differentiate this reason from the previous one. If you have a reasonable number of statistical individuals in each group, look at the variance in each group. If the variance in one group is much smaller, think: is there a theoretical reason for this? If yes, then you ought to have made a hypothesis about the variances, not about the means! Here is an example. Suppose you want to know whether a new teaching method is better than the traditional one used in schools, and you hypothesize that kids' reading scores measured at the end of the school year will be better with the new teaching method than with the traditional one. When you make such a hypothesis you are thinking of comparing the means in a group-model approach that supposes all other parameters of the distributions of the scores in the two to-be-compared groups are the same. Because the group-model approach supposes that the distributions of the scores in the two groups are both normal distributions, the other parameters are in fact the other parameter (without "s"), the variance. That is to say, the variance in one group is supposed to be equal to the variance in the other group. One needs to test this assumption before going any further (i.e., before testing for the difference in means), and it is this assumption that is tested by a homogeneity-of-variances test (e.g., Levene's). Now suppose the homogeneity-of-variances test is significant (i.e., p < 0.05): then the groups already differ in their variances, and comparing the means with a model that assumes equal variances is not warranted.
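Both points can be illustrated in R (all names hypothetical; leveneTest() is in the car package):

car::leveneTest(score ~ group, data = d)  # tests equality of group variances
d$log_score <- log(d$score)               # one common monotonic transform
                                          # (assumes strictly positive scores)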
Yes, thank you for taking the time to elaborate this. Wish you a good continuation of the weekend:)
Hello everyone, I have a matched-group design (4 groups). When matching 2 groups, the chi-square test is used to check matching on the few demographic variables on which we need matching, but I am confused about four groups. Shall I apply the same test (chi-square) or another test? Kindly help.
I suggest you use a two-way ANOVA with repeated measures.
Also read about the Latin square design.
Good luck.
One-way ANOVA – similar to a t-test, except that this test can be used to compare the means of THREE OR MORE groups (t-tests can only compare TWO groups at a time, and for statistical reasons it is generally considered "illegal" to use t-tests over and over again on different groups from a single experiment).
Two-way ANOVA – allows you to compare the means of TWO OR MORE groups in response to TWO DIFFERENT INDEPENDENT VARIABLES. With this test available, you can set up an experiment in which each member of your sample is exposed to varying levels of two different treatments. In a field study, this test allows you to compare a mean response variable across two different environmental conditions.
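For concreteness, both designs in base R (data frame `d` with hypothetical column names):

summary(aov(response ~ treatment, data = d))                # one-way ANOVA
summary(aov(response ~ treatment * environment, data = d))  # two-way ANOVA
                                                            # with interaction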
Try using a two-way ANOVA with a mixed model in the open-source R environment.