OLD QUESTION:
With paired and unpaired I do not mean dependent (experiments done on the same day) vs. independent (experiments done on different days).
For example:
I cultivated one cell line with substance A, B alone and with the combination of A and B. I repeated this experiment several times on independent days and weeks. I measured the viability of the substance groups in % of the control. Now I want to compare A, B and the combination and want to test for difference. To keep it simple I want to use a nonparametric test.
My statistical adviser is not completely sure whether my samples are paired or unpaired. She tends to think that cell culture experiments are in general paired. As her colleagues do not have a consistent opinion on whether cell culture experiments are paired or unpaired, she advises me to use the test that the majority uses.
Please help me! What do you use for similar cell culture experiments? If you need more details please ask me. Thank you in advance!
UPDATED QUESTION AND DESCRIPTION (09.05.2014):
1) Experimental settings:
I did cell culture incubation experiments for about a week in a 96well plate. The experiment settings were as follows:
Cells from one stock flask were split into 24 wells of the same 96-well plate.
6 Wells were incubated with substance A
6 Wells were incubated with substance B
6 Wells were incubated with substance A+B
6 Wells were incubated with solvent (control)
Several Blanks (no cells but substance A / B / A+B / solvent)
The chosen concentration for A and B was the concentration which would kill 50% of my cells (IC50). The IC50 was determined by concentration effect curves ahead of the combination experiments.
The medium of all wells was changed once during the experiment. As the substances were unstable, the medium was mixed with fresh substances each time the experiment was started or the medium was changed.
After the week the experiment was stopped and the end viability of the cells was measured with the help of an assay and a fluorometer. The fluorometer measured the viability of the cells in relative fluorescence units. This was one experiment.
This experiment was repeated 6 times during different weeks. I did not reuse cells. Cells which had never been used for experiments were used for the next experiment. Thus the experiments are independent from each other as I believe.
2) Basic analysis of the data:
I got 6 experiments with 6 replicates per experiment and treatment group.
For each experiment I firstly subtracted the blanks (wells without cells but with substance A, B,...) from each data point. Then I averaged the solvent control and determined the substance groups in % of this averaged control. Afterwards I averaged every test group. Thus I only got one data point per treatment group.
This analysis I repeated for all 6 experiments so that I got 6 data points for substance A, B, A+B and solvent control. My table of data looks similar to this:
Solvent control: 100%, 100%,...., 100% median: 100%
Substance A: 51%, 48%,.... 65% median: 50%
Substance B: 47%, 56%,.... 50% median: 50%
Substance A+B: 25%, 26%.... 21% median: 24%
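The processing steps described in 2) can be sketched as follows. This is only a minimal illustration for ONE experiment: the fluorescence values, the `wells` and `blanks` dictionaries and the function name `percent_of_control` are all hypothetical and are not the real data.

```python
from statistics import mean

def percent_of_control(wells, blanks):
    """Blank-correct every well, then express each group mean as % of the control mean."""
    corrected = {
        group: [value - mean(blanks[group]) for value in values]
        for group, values in wells.items()
    }
    control_mean = mean(corrected["control"])
    return {group: 100.0 * mean(values) / control_mean
            for group, values in corrected.items()}

# Hypothetical relative fluorescence units, 6 replicate wells per group.
wells = {
    "control": [1050, 1000, 1100, 1050, 1000, 1100],
    "A": [550, 600, 500, 550, 600, 500],
    "B": [530, 560, 500, 530, 560, 500],
    "A+B": [300, 320, 280, 300, 320, 280],
}
# Hypothetical blank wells (substance or solvent, but no cells).
blanks = {"control": [50, 50], "A": [50, 50], "B": [50, 50], "A+B": [50, 50]}

result = percent_of_control(wells, blanks)
print(result)
```

Repeating this for each of the 6 experiments gives one value per treatment and experiment. Note that the control is 100% in every experiment by construction, so it carries no variability of its own after this step.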
3) Questions concerning the statistical analysis:
I am not sure how to analyse these data statistically.
A) Is it not okay to determine the viability in percent of control? Should I not average the replicates?
B) Do I have paired or unpaired samples? Are my samples paired because I used only one cell line? Are my samples paired because the cells during one experiment are influenced by e.g. the same basic medium charge, same passage number, temperature, well plate,… Are these influences during each experiment so important that the samples of all 6 experiments are paired?
C) Shall I test for difference between A and A+B and B and A+B? If yes, shall I write afterwards sth like the following?
The combined treatment is statistically different to the single substances. As the median of the combined treatment is smaller than both single treatments, there cannot be a complete antagonism of the single effects of A and B when they are combined. However, more experiments are needed to decide if there is a slight antagonism, addition or a synergism of the single effect when the substances are combined. (I do not know if you know Webbs formula) Interestingly, the viability of the combined treatment is nearly the product of the viability of the single treatments. (0,5*0,5=0,25) Thought of testing for difference between the calculated combined treatment and the in reality measured viability of the combined treatment.
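The product rule mentioned here (0,5*0,5=0,25) can be written down as a small sketch. It assumes viabilities are given as fractions of the control; the function names and the `tolerance` value are hypothetical choices for illustration, not part of any published formula.

```python
def expected_viability_product(viability_a, viability_b):
    """Expected combined viability (fraction of control) if the single effects multiply."""
    return viability_a * viability_b

def compare_with_product_rule(observed, expected, tolerance=0.05):
    """Rough descriptive label; the tolerance is arbitrary and is NOT a statistical test."""
    if observed < expected - tolerance:
        return "lower viability than expected (towards synergism)"
    if observed > expected + tolerance:
        return "higher viability than expected (towards antagonism)"
    return "consistent with the product rule"

expected = expected_viability_product(0.5, 0.5)      # medians of A and B from the table
label = compare_with_product_rule(0.24, expected)    # median of A+B from the table
print(expected, label)
```

This only describes where the observed value lies relative to the product; whether the difference is more than noise still needs a test on the per-experiment values.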
D) Shall I do a two-way ANOVA, a Friedman two way ANOVA,...? HELP!
I hope my descriptions are clearer now. If there are questions left, please go ahead and ask me. Thank you very much in advance!
Generally I would use one-way ANOVA to calculate the variances in the groups (made of replicated measurements) and a post-hoc Tukey's test to compare all the groups in pairs (e.g., Ctrl vs A, Ctrl vs B, Ctrl vs AB, A vs B, A vs AB, B vs AB).
If I wanted to compare those groups with controls only (e.g., Ctrl vs A, Ctrl vs B and Ctrl vs AB) I would use the Wilcoxon signed-rank test (a nonparametric counterpart of the paired t-test) and calculate all the pairs separately.
Generally your conditions seem to generate "pairs", but for one-way ANOVA such information is not necessary
I hope it will help.
Completely agree with Dr Kamil, as I did the same statistical analysis for all my groups (in vivo and in vitro). You can use SPSS 20.0, which is very helpful and easy to use and contains all the tests mentioned above. TQ
I forgot to mention, I used GraphPad Prism. It also contains all needed tests.
Dear Sandra,
I would say use an unpaired t-test for each group, because to see the effect of A, B and AB you have taken different groups of cells. If you measured the effect (viability) of A, B and AB on the same cells, the data would be paired; otherwise they are unpaired. Since you want to see the effect of A, B and A+B, you are comparing only one parameter, not several parameters within a group such as different cell types or different time periods. Thus, an unpaired t-test will be appropriate.
Also, you can read and calculate more about t-test in given website:
http://www.graphpad.com/quickcalcs/ttest1.cfm
Good luck!
I am with Kamil. You could also use a t-test; mind, however, that this is not recommended for dose responses when comparing to non-exposed controls. If you decide on t-tests, they are paired if you started from the same starter culture (which should be the case). Paired will also be more stringent than unpaired. GraphPad is a good choice, and do have a good look at the manual; it comes close to a course in statistics.
Thank you for your answers!
Do I not need Gaussian variables for the one-way ANOVA as for t-tests?
As I am not sure that my variables are normal I wanted to use a non-parametric test to keep the determination simple.
I also have SPSS. I thought of using the Mann-Whitney U test if my samples are unpaired, or the Wilcoxon or sign test if they are paired. Furthermore, I thought of a Bonferroni correction of the p value when I compare A with A+B and B with A+B.
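To make the paired option with a Bonferroni correction concrete, here is a minimal sketch of an exact two-sided Wilcoxon signed-rank test written in plain Python. The six value pairs are hypothetical (one value per experiment), the function names are made up for this illustration, and zero differences are simply dropped.

```python
from itertools import product

def average_ranks(values):
    """Ranks 1..n of the values; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def signed_rank_exact_p(x, y):
    """Exact two-sided p-value, enumerating all sign patterns of the paired differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    half_total = sum(ranks) / 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - half_total)
    extreme = 0
    for signs in product((0, 1), repeat=len(diffs)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - half_total) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** len(diffs)

# Hypothetical % of control, paired by experiment (6 experiments).
substance_a = [51, 48, 55, 52, 65, 49]
combination = [25, 26, 23, 24, 21, 28]

p_value = signed_rank_exact_p(substance_a, combination)
bonferroni_alpha = 0.05 / 2   # two comparisons: A vs A+B and B vs A+B
print(p_value, p_value <= bonferroni_alpha)
```

One thing the sketch shows: with only 6 pairs, the smallest possible two-sided p-value of this test is 2/64 = 0.03125, which is above a Bonferroni-corrected threshold of 0.025. So with n = 6, two signed-rank comparisons can never both pass a Bonferroni correction, however clear the effect looks.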
At the moment I see 2 for paired and 1 for unpaired, isn't it?
@Navin: Yes, the substances affect different cells but one cell line. The cells were incubated in parallel in 96-well plates with either the solvents of A and B, or substance A and the solvent of B, or substance B and the solvent of A, or the combination of A and B. This experiment was repeated several times. The treatment groups were afterwards calculated in % of the solvent control. Nevertheless, you think the treatment groups are completely unpaired. They are not dependent on the experiment day and the conditions of that day. Yes?
@Dominic: Missed your answer while writing my new answer.
That makes 2 unpaired and 3 paired.
I would rather agree with you Dominic that my treatment groups are unpaired as I indeed used different wells for solvent control, A, B, and the combination.
But I am not sure, because some people here also told me my experiments were paired, since the cells were incubated on the same day under the same general conditions and I used the same cell line.
Dominic, you would not use the Mann-Whitney U test instead of the Kruskal-Wallis test, would you? My n is really small (6 or 7, respectively). Is it nevertheless okay to use the Kruskal-Wallis test? Do you know what the difference between Mann-Whitney and Kruskal-Wallis is? What requirements does the Kruskal-Wallis test have?
Thank you in advance!
Do I understand you aright: You would use the Kruskal-Wallis test if I am not sure whether my data are normal and have homogeneous variances. You spoke of post tests after the one-way ANOVA. Which post test would you use after the Kruskal-Wallis test?
My aim was to investigate whether the effects of the single substances are not affected when the substances are combined. As I learned some weeks ago, the drug combination index of Chou would have been perfect for my experiments. However, I am not able to repeat my experiments to fit his requirements.
So I thought of testing for difference between substance A and the combination and substance B and the combination. Furthermore I wanted to discuss: As the median/mean effect of the combination is not weaker than the effect of one of the single substances there cannot be a complete antagonism of A and B when combined. To decide if there is a weak antagonism, an addition of the effects or a synergism I wanted to advise more experiments.
Dominic or others: Do you think the Kruskal-Wallis test or the ANOVA with post tests would be helpful in this question setting?
Furthermore, what I can indeed see is that the effect of A+B is nearly the same when I multiply the measured single effect of A with the effect of B. I think Webb postulated 1963 that there is addition if this is true. But as concentration effect curves are normally sigmoidal in cell culture I only wanted to mention this but did not want to take it as proof for addition.
What do you think? Thank you in advance!
Hi Julie, the same cells (eg primary cells from the same patient) divided into two groups, untreated and treated form a pair. You can use paired t-test to calculate if there's any difference between them after treatment. Of course for proper t-test you need at least 6 pairs of results (6 replicates) to be able to call the result "significant". GraphPad for instance will tell you what number of replicates is enough.
Unpaired tests are used, for example, when you are looking at gene expression in malignant cells and healthy controls (although you are looking at the same gene, the samples are not from the same source and do not form pairs). An unpaired test is the obvious choice when the number of samples in each group is different (e.g., you have 100 tumor samples and 20 healthy controls).
Regards,
Kamil
@Julie:
Sorry that I confused you. I recognized that my old post was quite confusing. So I corrected it and will post a new version of it (see below).
You are right, I had 6 replicates per experiment and substance group, but I also repeated the whole experiment several times over several weeks and months. So I also have independent samples, not only replicates, in my experimental run.
With "I cannot repeat it" I mean that, as our project group no longer exists, I do not have the lab access, the money or the time to start all over again with the experiments. That unluckily means I have to use the results I have.
I hope now my explanations are better. Thank you for the whole title and showing me how confusing the old post was!
"Perhaps you could look to see if the confidence intervals of the IC50 values overlap?"
How does that help me with my test of difference between A, respectively B and A+B or my question "is there antagonism when A and B are combined"? Sorry do not understand your suggestion.
To summarize the results again: 3 for unpaired and 3 for paired.
I would go for unpaired as it naturally comes to my mind. But like Julie's, my cells derive from the same cell line and are seeded into one 96-well plate from the same stock flask for one experiment of a bigger experimental run.
So does it make the bigger experimental run unpaired because I repeated the one experiment several times at independent days and weeks? Or is it still paired because I used the same basic cell line? Or is it still paired because the control and the substance groups of one experiment have more in common than all the experiments of the run?
@Thomas and others: What would you advise me as I cannot use Chou's index but also cannot restart my experiments (no lab, no money) ?
By the way, the use of Chou's index would give me a headache because the concentration effect curves were done ahead of the combination experiments to determine the IC50, and they were not repeated as often as the combination experiments. I would nevertheless like to mention Chou's index in the discussion but would not use it in the results.
What do you think about my idea of testing for difference between substance A, respectively B and the combination and saying something like the following:
As the median/mean effect of the combination is not weaker than the effect of one of the single substances there cannot be a complete antagonism of A and B when combined. To decide if there is a weak antagonism, an addition of the effects or a synergism further experiments are needed.
Furthermore, the effect of the combined substances is about the same as the product of the measured single effect of A and the measured single effect of B. I think Webb postulated 1963 that there is addition if this is true. But as concentration effect curves are normally sigmoidal in cell cultures I only wanted to mention this but do not want to present it as a clear proof for addition.
What do you think. Thank you in advance.
Hi Sandra,
I believe the choice of statistical test depends on your statistical hypothesis. From what I understood, you want to see if there is any difference between Control, A, B and A+B. Now assuming that every time you did the experiment you had all 4 conditions and used the same cells on the same day, you are basically testing different compounds on the same subject, which means your samples are paired.
Now when you repeated the experiment, let's say after a week, with all 4 conditions on the same cells, that is a different subject (subject 2), but you are still testing different compounds on the same cells on that particular day, and you still want to see if there is any difference between these conditions in subject 2.
Your six replicates, each with all 4 conditions and each condition treated similarly on that particular day, give you 6 subjects (N=6), and you are still interested in a difference between treatments (not between subjects), so in my opinion your samples are paired. You should do a repeated measures ANOVA.
However, if you did different treatments on different days you should stick with an unpaired test.
Hope this will help and hope not to confuse you.
If one group of cells receives only one treatment you can use an unpaired t-test. However, if the same group of cells received two different treatments (like a patient taking a placebo and later a different treatment), you should select a paired t-test.
In both cases you will have one average for each group and compare them.
I hope that clarification will help you.
Dear Thomas,
In my opinion, we should pair the conditions within a day, not the cells within one condition across different days.
so
day 1) Control, A, B, A+B is a repeated measure (as they are basically the same cells in the same environment, on the same day)
day 2) Control, A, B, A+B is another repeated measure
and so on
Here is a link to a stats page that explains how we can use repeated measures ANOVA for
(1) changes in mean scores over three or more time points, or (2) differences in mean scores under three or more different conditions.
https://statistics.laerd.com/statistical-guides/repeated-measures-anova-statistical-guide.php
As per this guide, don't we fall under the 2nd example, where conditions/treatments are the independent variable? Correct me if I am wrong.
I do think that cell culture conditions vary between different days, so we should not treat control cells from day 1 as equal to those from day 5. They are similar but not exactly the same.
Dear Sandra - am sorry that you are in this position. It would be helpful to discuss this face to face - is there someone at Bochum that you could show the data? However, based on the information you provided and then re-posted, it would be appropriate to regard the data as UNPAIRED, and to subject it to a repeated measures ANOVA with a Tukey. Hope this helps, good luck.
If I am not mistaken we now have 3 paired VS 5 unpaired. However, I am not completely sure if Hardik was also for unpaired. If yes than it would be 3 VS 6.
@Thomas:
The publication is great. Then I can write in my thesis I did it like.... Thank you!
However, for me it would be enough to show that the effects of A and B are additive in the combination. Can I use this approach nevertheless?
I am a little bit irritated because the paper speaks about a two way analysis by ANOVA but everyone here speaks about a one way ANOVA. So what is better for my data a one way or two way ANOVA?
@Hardik
Yes, I repeated my experiments several times. However, every time the experiment included different cell culture wells for control, substance A, substance B and the combined treatment. No well ever got more than one treatment. Do I understand you aright? If I only had one day you would say I have a paired experiment, but as I repeated it independently several times you think I have an unpaired experimental run.
@Avijit:
Yes, I have a statistical advisor but she normally analyses patient studies.
As she has not analysed cell culture experiments before, she advised me to also ask other biologists. That is why I am here asking for your opinion. Yes, it is not easy! But now I have more ideas to discuss with my advisor. Thanks to every one of you!
Nevertheless, I am still interested in your suggestions and will keep you up-dated how I finally decide to analyse my data.
Hi Sandra, I am actually for paired comparison, I think you should do One way Repeated measure ANOVA.
Sorry for the confusion, I just wanted to make sure that all conditions were treated similarly, on the same day. and you did. So I will vote for Repeated Measure One way ANOVA.
Also on the two-way ANOVA: as Dominic described, it is for 2 variables, but from what I understand you have only one variable, that is "Treatment". If you had done 3 treatments on 3 different cell lines you should do a two-way ANOVA. So again, for this situation my vote goes to a repeated measures one-way ANOVA.
Hope I did not create any more confusion.
Good luck with your analysis.
Thomas, I thought for an additive effect AB has to be 30% because 0,6*0,5=0,3. Am I wrong? I derived this assumption from Webb's (1963) more complex formula.
Do you know if you can do a two-way ANOVA with SPSS? There is lots of information about one-way ANOVA in the internet but hardly any about two-way ANOVA and SPSS.
I found under "General linear model" "Univariate...", "Multivariate", "Repeated Measures" and "Variance Components". Do you know if "Multivariate" would be the right choice for a two-way ANOVA? Thank you in advance!
I thought I needed two columns. One column with the Viability Data (e.g. 100, 99,98 / 60, 61, 63 / 50, 51, 52 / 10, 12, 13) and a second column with the treatment (e.g. 1, 1, 1 / 2, 2, 2 / 3, 3, 3 / 4, 4, 4). The Viability Data is dependent and the treatment is the fixed factor. Am I right?
I am confused because if I run the test and the post test with my in reality measured data like this even A VS B is significant. Is that possible or did I get the basics wrong? Thx
I have no idea. Sorry! I wanted to do a two-way ANOVA and used your link for assistance. Did I get the basics wrong by creating only two columns? (One for the viability data and one for the treatment groups)
Aaaa... okay! I will try it right away and let you know what results I get. Thx
No sorry I only have excel and SPSS. Do not know if there is a campus license for graphpad prism. Be back when I have the new results!
Thx. So far I can tell the post hoc tests were not performed because 0 and 1 in substance group A and B are not enough for a post hoc test. SPSS told me at least 3 different subgroups were needed. :( What do you think?
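The difference between the single treatment column (1 to 4) and the two 0/1 columns for A and B can be sketched like this. The viability numbers are the example values from the post above; the mapping of codes 1 to 4 onto control, A, B and A+B is an assumption made for illustration, and the names `CODE_TO_FACTORS`, `to_two_factor_rows` and `cell_means` are hypothetical.

```python
from statistics import mean

# Assumed meaning of the single treatment code: 1=control, 2=A, 3=B, 4=A+B.
# Each code is turned into (A present, B present).
CODE_TO_FACTORS = {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (1, 1)}

def to_two_factor_rows(treatment_codes, viability):
    """One 4-level treatment column -> rows of (A, B, viability) for a 2x2 design."""
    return [(*CODE_TO_FACTORS[code], value)
            for code, value in zip(treatment_codes, viability)]

def cell_means(rows):
    """Mean viability for each (A, B) combination."""
    cells = {}
    for a, b, value in rows:
        cells.setdefault((a, b), []).append(value)
    return {cell: mean(values) for cell, values in cells.items()}

treatment = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
viability = [100, 99, 98, 60, 61, 63, 50, 51, 52, 10, 12, 13]

rows = to_two_factor_rows(treatment, viability)
means = cell_means(rows)
print(means)
```

With this layout each factor has only the two levels 0 and 1, so the main-effect test already is the only pairwise comparison for that factor. That would explain why no post hoc test is run for A or B: post hoc tests only make sense for factors with three or more levels.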
Okay but where can I see antagonism, addition or synergism between A and B in the combined treatment?
Do You mean in the table "Tests of between-Subjects Effects" A*B?
Source             df    F      Sig.
Corrected Model     3    179    0.000
Intercept           1    325    0.000
A                   1    139    0.000
B                   1    381    0.000
A*B                 1     18    0.000
Error              20
Total              24
Corrected Total    23
Or do you mean data in the tables "Estimated Marginal Means"?
I tried but didn't understand everything. I saw that F= is very important. But didn't understand the following:
"[...] the F statistic for the interaction effect is the right way 4.79, with 1 and 12 degrees of freedom critical value of F for a=0.05 and 1 and12 degrees of freedom, which is 4.75. The computed value of 4.79 exceeds this critical value and so it is concluded that there is a statistically significant positive interaction effect [P
Thx for the text. If it's okay I'll read it tomorrow because unfortunately I'm not very receptive any longer. I might understand it easier after a good night sleep. Hope you have time to answer further new appearing questions tomorrow. Thx for your extensive help concerning my statistical problems.
Sorry I'm really tired. Will reread your posts tomorrow and will read the new paper. Hopefully, I will understand afterwards how I can decide with the help of the ANOVA if there is antagonism or addition. If I still don't understand it I hope you have time for some more questions tomorrow! Goodnight!
I am back. If I understand the papers aright they decide on synergism, addition or antagonism by judging the profile plots, don't they?
Still I do not understand how they calculated the critical value of 4,75 in the first paper.
"[...] the F statistic for the interaction effect is the right way 4.79, with 1 and 12 degrees of freedom critical value of F for a=0.05 and 1 and12 degrees of freedom, which is 4.75. The computed value of 4.79 exceeds this critical value and so it is concluded that there is a statistically significant positive interaction effect. [...] "
These were actually my real data: (My real data are in % of the control)
Source             df    F      Sig.
Corrected Model     3    179    0.000
Intercept           1    325    0.000
A                   1    139    0.000
B                   1    381    0.000
A*B                 1     18    0.000
Error              20
Total              24
Corrected Total    23
When I look at the plot (see below) I would say antagonism, but I really don't understand everything you said. I hope you can help me.
I have already used Webb's formula for my data and got:
67.7% measured inhibition of viability antagonism
And by using Chou's formula ignoring his requirements of doing the concentration effect curves simultaneously with the combination experiments, I get an index of 0,847 (= moderate synergism).
A) With paired and unpaired experiments I did not mean dependent (experiments done on the same day) and independent (experiments done on different days).
B) To Thomas' answers and questions:
Actually, I have no idea if these results are biologically plausible. I think these substances were not combined before my experiments. We hoped for at least addition. However, working with data in % of the control does not affect the analysis, does it?
So the talk about 4,75 and 4,79 is not important for me, respectively my results, is it? I do not need something like that for my results, do I?
Concerning Chou: I did concentration effect curves as he wants. However, these experiments were done to determine the IC50 for the combination experiments. I did the concentration effect curves because of this ahead of the combination experiments with the IC50 of A, B.
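On the 4,75 versus 4,79 question in B): 4.75 is the tabulated critical value of the F distribution for alpha = 0.05 with 1 and 12 degrees of freedom, and the paper compares its computed F of 4.79 with it. A minimal sketch of where such a number comes from, valid only when the first degree of freedom is 1 (then F is the square of a t variable). The function names are hypothetical and the integration is a simple Simpson rule, not a library routine.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def f_tail_prob_df1_is_1(f_value, df2, steps=2000):
    """P(F > f_value) for F with (1, df2) degrees of freedom, using F = t**2."""
    upper = math.sqrt(f_value)
    h = upper / steps
    total = t_pdf(0.0, df2) + t_pdf(upper, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df2)
    area = total * h / 3            # Simpson's rule for P(0 < T < upper)
    return 1 - 2 * area

def f_critical_df1_is_1(alpha, df2):
    """Value f with P(F > f) = alpha, found by bisection."""
    low, high = 0.0, 1000.0
    for _ in range(60):
        mid = (low + high) / 2
        if f_tail_prob_df1_is_1(mid, df2) > alpha:
            low = mid
        else:
            high = mid
    return high

crit_paper = f_critical_df1_is_1(0.05, 12)   # the (1, 12) case quoted from the paper
crit_own = f_critical_df1_is_1(0.05, 20)     # error df = 20 as in the SPSS table above
print(round(crit_paper, 2), round(crit_own, 2))
```

SPSS does this comparison internally and reports the result in the "Sig." column, so the critical value does not have to be looked up by hand. For the table above (error df = 20) the critical value is about 4.35, and the reported F of 18 for A*B is well above it, which matches Sig = 0.000.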
C)
I am really not sure how to present the two-way ANOVA results. Would a description like this be okay?
(1) I showed with the two-way ANOVA that A, B and their combination had a significant effect on the viability of the cell line. (2) However, the interaction effect plot showed that there is a slight antagonism, as the lines converge slightly. This agrees with the results of Webb's formula.
However:
Could the results of (1) not have been obtained by e.g. a Mann-Whitney U test between control and A, B and AB?
Would my first idea not be better? Or can I combine all these tests and information to a better description of my data? I try it afterward:
[BTW my first idea was to test for difference between substance A, respectively B and the combination. I also wanted to test for difference between AB and the calculated product of the single effect of A and B (-> Webb). ]
The two-way ANOVA showed that A, B and their combination had a significant effect on the viability of the cell line. Mann-Whitney U tests, for example, showed that A, B and the combined treatment are different. As the median/mean effect of the combination is not weaker than the effect of either single substance, there cannot be a complete antagonism of A and B when combined. However, the interaction effect plot showed that there is a slight antagonism, as the lines converge slightly. Webb's formula also gives a slight antagonism. However, this formula was developed for hyperbolic concentration effect curves, which are not found in cell culture experiments. So these results should be seen as a possible sign of a slight antagonism. To show clearly whether there is a weak antagonism or after all an addition of the effects, further experiments are needed.
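One point that may help when reading the interaction plot: whether the lines converge depends on the scale. A small sketch with the medians from the table in the updated question (100, 50, 50, 24); the function name `interaction_contrast` is hypothetical. On the raw % scale "no interaction" means the effects add as differences, while the product rule corresponds to "no interaction" on the log scale.

```python
import math

def interaction_contrast(control, a, b, ab):
    """2x2 interaction: effect of B with A present minus effect of B alone.
    Zero means parallel lines in a profile plot."""
    return (ab - a) - (b - control)

medians = (100, 50, 50, 24)   # control, A, B, A+B in % of control (from the table)

raw_scale = interaction_contrast(*medians)
log_scale = interaction_contrast(*(math.log(v) for v in medians))
print(raw_scale, round(log_scale, 3))
```

On the raw scale the contrast is 24 percentage points (B lowers viability by 50 points alone but only by 26 points when A is present), so the lines converge and a two-way ANOVA on % values would flag an interaction. On the log scale the contrast is about -0.04, i.e. very close to the product rule. So converging lines on the % scale and near agreement with the product rule need not contradict each other; which of them is read as "addition" depends on the reference model chosen.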
D) Requirements for a two-way ANOVA
But can I do a two-way ANOVA with my data? Requirements have to be fulfilled for it. I did a Kolmogorov-Smirnov test for control, A, B and AB (n=6) separately, like this website does: http://www.univie.ac.at/ksa/elearning/cp/quantitative/quantitative-62.html.
My extreme differences were all smaller than 0.519 and the asymptotic significances were about 0,2 and thus bigger than 0,05. I hope that is enough evidence for normality of my data. Or do you think I had better do a Shapiro-Wilk test? I am not sure how to arrange my data for this test.
The same for the Levene’s test I am not sure if I should test control, A, B and AB separately or all together.
Sorry for the really long post. Thx in advance!
@Thomas
Thx for your answer.
A) But if the data in % of control are not normally distributed then I cannot use the ANOVA. Normality and homogeneity are required if I am not mistaken.
I would like to send you my data but I am unsure. What compensation do you want for this extra service? Naturally, I would like to thank you in the acknowledgements of my PhD thesis, but I cannot promise much more as my boss decides who will be co-author.
I think you should use a Student t-test for paired data or a Wilcoxon test for non parametric data.
I hope it will help
https://statistics.laerd.com/spss-tutorials/friedman-test-using-spss-statistics.php
I am not sure if you can do a Friedman two way ANOVA with SPSS. Does the link describe what you recommended for my data? Or is this the one way model? Thx in advance.
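For orientation, this is what the Friedman test computes: the values are ranked within each experiment (block) and the rank sums of the treatments are compared. A minimal sketch without tie handling; the six rows are hypothetical % of control values for A, B and A+B, and the function name is made up. The control is left out here because it is 100% in every experiment by construction.

```python
import math

def friedman_statistic(blocks):
    """blocks[i][j]: value of treatment j in experiment i (assumes no ties within a row)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical data: one row per experiment, columns A, B, A+B.
experiments = [
    [51, 47, 25],
    [48, 56, 26],
    [55, 50, 23],
    [52, 54, 24],
    [65, 50, 21],
    [49, 53, 28],
]

q = friedman_statistic(experiments)
p_approx = math.exp(-q / 2)   # chi-square tail, valid for 3 treatments (2 df) only
print(round(q, 3), round(p_approx, 4))
```

The chi-square approximation is rough for only 6 experiments, so the p-value here is indicative at best. The test treats each experiment as a block, i.e. it is the rank-based counterpart of the paired / repeated measures view discussed above.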
Will send the data to you tomorrow. Do you need further information about my data than I mentioned here? Thank you very much for your offered help.
Hi.
If you treat the same cell line at different time points with different substances, then you should use ANOVA for repeated measurements or t-tests for paired samples for the analysis.
If you treat different or independent cell lines at different time points with different substances, then you should conduct t-tests for unpaired samples or ANOVA for independent measurements for the analysis.
I wish all of you a nice weekend !
Best regards,
Andrea
Identify your sources of variation. The reason you performed the experiment on multiple days is to control for technical error and estimate variability within the cell line. The only statistical inferences you will make about the effects of A or B are in the context of this cell line alone. You don't really care if the day of treatment affects the response of cells, so why would you test for that? The effect of day gets moved to your error term in your Kruskal-Wallis test. In that sense, the fact that it is a clonal cell line is irrelevant - of course replicates are inherently non-independent but only compared to primary cells. Like I said, your inferences are restricted to this cell line anyway. I think a practical approach would be to perform a two-way test including day of addition to convince yourself that there is no confounding effect of day (and that you have good pipetting skills), but it is unnecessary to include a two-way ANOVA in a publication even if there is significant variation due to day. It's not relevant to the question and it is functionally unimportant to control for. Of course, statistical significance does not equal biological significance, so the synergistic or antagonistic effects of A+B are up to your interpretation.
Without knowing all the details on how the experiment was laid out and carried out, my first answer would be to treat the variable "time" as a blocking variable and then proceed to analyze the results as a randomized complete block design (assuming that you ran all the treatments every day). If the blocking variable (day) turns out to be significant, then you would have successfully removed the variability arising from this nuisance factor.
If you decide not to take the blocking variable into account in your analysis, all the variability arising from potential day-to-day differences will be included in the error term of your model, increasing its variance and possibly making some real treatment effects undetectable (not significant).
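The effect of blocking described above can be seen in a small sketch that computes the treatment F statistic once with day as a block and once ignoring it. The 3 x 2 table is hypothetical (3 days, 2 treatments, with a strong day-to-day shift), and the function names are made up for this illustration.

```python
from statistics import mean

def _treatment_ss(table):
    """Treatment sum of squares and treatment means for table[day][treatment]."""
    n_blocks, n_treat = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    treat_means = [mean(row[j] for row in table) for j in range(n_treat)]
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    return ss_treat, treat_means, grand

def rcbd_f_statistic(table):
    """F for treatment with day as a blocking factor (randomized complete block design)."""
    n_blocks, n_treat = len(table), len(table[0])
    ss_treat, treat_means, grand = _treatment_ss(table)
    block_means = [mean(row) for row in table]
    ss_error = sum(
        (table[i][j] - block_means[i] - treat_means[j] + grand) ** 2
        for i in range(n_blocks) for j in range(n_treat)
    )
    df_treat, df_error = n_treat - 1, (n_blocks - 1) * (n_treat - 1)
    return (ss_treat / df_treat) / (ss_error / df_error)

def one_way_f_statistic(table):
    """F for treatment when the day is ignored; day variation ends up in the error term."""
    n_blocks, n_treat = len(table), len(table[0])
    ss_treat, treat_means, _ = _treatment_ss(table)
    ss_within = sum(
        (table[i][j] - treat_means[j]) ** 2
        for i in range(n_blocks) for j in range(n_treat)
    )
    df_treat, df_within = n_treat - 1, n_treat * (n_blocks - 1)
    return (ss_treat / df_treat) / (ss_within / df_within)

# Hypothetical responses: rows are days, columns are two treatments.
table = [[10, 12], [20, 23], [30, 31]]

f_blocked = rcbd_f_statistic(table)
f_unblocked = one_way_f_statistic(table)
print(f_blocked, round(f_unblocked, 3))
```

In this toy table the treatment difference is small but consistent within each day. With blocking the F statistic is 12; without blocking the same difference is swamped by the day-to-day variation and F drops to about 0.06.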
Sandra,
My advice on the matter at hand is going to be a bit technical but keep reading! :)
Here is what I teach to my students, hope this will help.
The first thing to do is to identify the kind of dependent variable you have. Many people skip this step and doing so leads to choosing wrong statistical methods to process experimental data and make sense of it.
By this I mean: i) is it numeric? (from what I gather from your presentation of the experimental situation, the answer to this question is yes); ii) is it continuous or discrete? (e.g., the number of cells that survive is a discrete measure; time to death is a continuous measure, even if in the end, for measurement's sake, it is discretized into milliseconds, for instance); iii) does it have a left bound, does it have a right bound, or neither? (e.g., "number of cells that survive" has a left bound and no right bound; the same applies to "time to death", but there are many other dependent variables that may have different bounds or no bounds at all).
The statistical distribution you will suppose for / impose on your dependent variable depends on your answer to these questions. Let me elaborate on this:
- dependent variable with left bound and no right bound: your dependent variable is often well described by a gamma distribution; this can be well approximated by the lognormal distribution, i.e. you first take the logarithm of your dependent variable and then run your analyses on the transformed variable, assuming that it follows a normal distribution (this is where the "ANOVA" kicks in, but, as you will see if you keep reading, I will advise you against the blind use of ANOVA, following an argument similar to that of Noel Artiles-Leon)
- dependent variable with both a left and a right bound: it is usually modelled with a beta distribution. This is not easy to work with, but you can try the betareg package in R. If you express your dependent variable in the experimental conditions as a % of what you get in the control condition, you will have a dependent variable with both a left and a right bound, hence beta regression. I strongly advise against expressing your dependent variable in the experimental conditions as a % of what you get in the control condition. More on that below.
- dependent variable with no bound (neither left nor right): if the distribution of your dependent variable is normal within each condition (i.e., in control, in A, in B and in A+B; you can check this with a Shapiro-Wilk test, e.g. on the model residuals), then your dependent variable "has" a normal distribution and Gaussian regression methods are appropriate for analyzing your data.
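As a small illustration of the log-transform route mentioned in the first case, here is a hedged Python sketch. The readings in `raw` are invented, and the variable names are only illustrative. It takes logarithms of a left-bounded response, summarises on the log scale, and back-transforms the mean, which gives the geometric mean on the original scale.

```python
import math
import statistics

# Hypothetical raw readings (bounded below by 0, no fixed upper bound).
raw = [1200.0, 950.0, 1430.0, 1100.0, 2050.0, 1010.0]

# Work on the log scale, where a lognormal variable is normally distributed.
log_values = [math.log(x) for x in raw]
log_mean = statistics.mean(log_values)
log_sd = statistics.stdev(log_values)

# Back-transforming the log-scale mean yields the geometric mean.
geometric_mean = math.exp(log_mean)

print(f"arithmetic mean: {statistics.mean(raw):.1f}")
print(f"geometric mean:  {geometric_mean:.1f}")
print(f"SD on log scale: {log_sd:.3f}")
```

Any model fitted afterwards (ANOVA, mixed model) would then use `log_values` as the response, and differences on that scale correspond to ratios on the original scale.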
Now on the type of analysis that you ought to consider. I think that at some point or another you may want to use Gaussian regression methods (or general(ized) linear models), of which one of the most commonly (but often wrongly) used is the ANOVA. These methods require that each measure be independent of the others and that the measures follow a normal distribution conditional on the levels of the independent variable(s) you consider. This is *not* the case for your data, because you collect data on the same statistical subject on various occasions (the different days). The correct analysis of your data is through mixed-effects models.
I do not clearly understand what your dependent variable is, but apart from that, here is the model you may want to "put" on your data (let DV be your dependent variable):
DV ~ (1|Petri_dish)+Days+Treatment, or
DV ~ (1|statistical_subject)+Days+Treatment
By "Petri_dish" (or, if you like that better, "statistical_subject") I mean each and every individual subject of your experiment, whatever the condition it is in.
The notation is that of the lmer function in the lme4 package for R, see here:
http://cran.r-project.org/web/packages/lme4/lme4.pdf
If you are interested only in the "overall" effect of the Treatment (A, B, A and B vs. control), then the model I propose is a good starting point.
In a way related to what Noel Artiles-Leon explains above, this mixed-effect model has the advantage of modelling the effect of Days, so this source of variation is "taken out" of the error term, leaving room for an experimental effect to emerge; this is a method superior to that of say, averaging over the Days. Also, the random effect of Petri_dish "takes out" (or rather it models) the variations in "performance" between the different statistical individuals of a same group.
If you are interested in the dynamics of evolution (in time, i.e., over the Days) of your statistical_subject as a function of the Treatment, you may want to consider growth models, a special case of mixed-effect models. Here is such a model :
DV ~ (1+Days|Petri_dish)+Days+Treatment
and another one, which supposes a quadratic growth :
DV ~ (1+Days+Days²|Petri_dish)+Days+Treatment
(with Days² = Days_squared, i.e., the variable Days to the second power)
HTH,
Serban
Sorry I am a little bit lost. The days do not interest me.
On one side I have the cytotoxic substances A, B and the combined treatment and on the other side the viability due to the used substances. I repeated the experiment 6 times and each experiment day has 6 replicates. I averaged the replicates for each experiment day and determined the viability afterwards as % of control. So I have at the end only 6 values for each substance group. One for each experiment day.
I want to determine if A and B act additively when they are combined or if they antagonise each other. That is my main aim.
Would you not determine the data as % of control but rather keep it as relative fluorescence units? Would you also use a Friedman two way ANOVA to check what is true for my data? Do you want to pool my data? Thx in advance!
Sorry, Sandra, I am quite lost myself, I am definitely not a biology guy. I do understand some biology, so I get such things as cytotoxic and the like, but I fail to understand the following :
- "I repeated the experiment 6 times" : does this mean that you have 6 individuals in each of the 4 Treatment conditions (control, A, B, A and B)? Or, if not, what does it mean?
- "and each experiment day has 6 replicates" : again, what does this mean?
- "I averaged the replicates for each experiment day and determined the viability afterwards as % of control." What are you averaging? (I need to understand that in order to help.) And why would you average over the individuals? (No individual reacts to a treatment level in exactly the same way as another individual within the same treatment level; averaging is thus a poor option, because you could instead model the effect of being a particular individual as a random effect and thus take it out of the error term, which can considerably increase the power of your analyses.) As to measuring things as % of control, see my previous comment, AND you can only do this if you average (which you should not).
I do understand your aim, but I do not understand your experiment and your independent variables. Below I will ask you some specifics.
"Would you not determine the data as % of control but rather keep it as relative fluorescence units?" Yes, keep the dependent data in whatever (raw) unit you have it for each individual. Definitely.
"Would you also use a Friedman two way ANOVA to check what is true for my data? Do you want to pool my data? " No pooling, definitely. Thus no ANOVA, of any kind. Pooling is not only a poor statistical decision given the data at hand but it is also a philosophical flaw (it will lead to conclusions on a level that is different from that of the measurement level).
OK, now time for moving on. I do need to understand your experiment, so here is how I understand it. Please correct where necessary and provide details where you see that I am wrong in what I suppose/ I understand.
So I think you have 6 individuals per Treatment level (6 Petri dishes with some kind of "beast" that are given A, 6 Petri dishes with the same kind of "beast" that are given B, 6 Petri dishes with the same kind of "beast" that are given A and B, 6 Petri dishes with the same kind of "beast" that are given their standard food; "beasts", or whatever, but I have to make this more concrete). Am I correct so far? If not, what is it that you did and how does it differ from "my story"?
You give to all statistical individuals (i.e., 6*4=24 Petri dishes with the same kind of "beast") what they get according to the Treatment assignment (i.e., you do not change what a Petri dish is given to "eat" from one day to another), once a day, for 6 days in a row. Is this so? Or do you add the substances to your Petri dishes more than once a day, for 6 days? (If so, how many times a day? Always at the same time? These could be complementary variables.)
Your dependent variable is the amount (or concentration of) "beast" still alive in each Petri dish. And you measure this on all measure occasions. Please correct here if necessary.
Looking forward to your answer.
Cheers,
SCM
Sandra, of course the "days" do not interest you! ... That is why it is a nuisance factor that can be treated as a blocking factor. It should be clear that just because a variable (like time) does not interest you, that does not mean that it is not affecting your responses.
I would not recommend taking the average of the 6 responses of the same treatment within each day because then you lose information about the variability among replicates within each day.
Again, without knowing much about how the experiment was carried out there is not much that I can recommend... Were substances A and B prepared at the beginning of the experiment and the same batch used every day? If that is the case, is it possible that the effect of these substances changes with time? Were the substances prepared every day? If that is the case, how do you account for the variability between days due to the preparation? In the treatment combination AB, did you use the same proportion of A and B every day? ...The statistical analysis must correspond to the experimental design that you had, and the design is still not clear to me from this discussion.
Noel,
From what I have gathered so far from Sandra's explanations (still a bit fuzzy, so to speak), I think that her main interest is in the "final" outcome, i.e., whether the different Treatment levels have different effects on survival.
This is the "simple" question she asks.
BUT, I still think that this "simple" question may reveal itself to be too simple, or simplistic (if I may). By this I mean that the final survival rate maybe is and maybe is not all the "truth" that her data can tell. Sometimes, truth is in the details. Growth models are capable of revealing those details (if I understood well her experiment).
Of course, a simple mixed-effect model (as the first one I propose) would be a huge improvement over an ANOVA on pooled data, and in that I much agree with you.
Cheers,
SCM
Sorry for confusing everybody. This was my test design:
I did cell culture experiments. At first I determined the concentration of the substances which would kill 50% of my cells. After that I incubated X cells for about a week with either substance A, B, A+B or solvent (control). I used the concentrations of A and B which I had determined before. For A+B I used the same concentrations of A and B as in the single treatments. I used a 96-well plate. Per experiment I had 6 wells with cells incubated with substance A, 6 wells incubated with substance B,...
The medium of all wells was changed once during the experiment. As the substances were unstable, the medium was mixed with new substances each time the experiment was started or the medium was changed. After the week the experiment was stopped and the viability measured with the help of an assay and a fluorometer. The fluorometer measured the viability of the cells in relative fluorescence units. This was one experiment.
I repeated this experiment with the same basic settings 6 times but during different weeks. I did not reuse the cells. Cells which had never been used for an experiment before were used for the next experiment. Thus the experiments are independent from each other as I believe.
I know that time, temperature, moon phases, passage number of the cell line, medium and substance batches, ..... influence the test results.
Yes, only the viability at the end of the week interests me and was measured. It was not possible to measure in between, as the setting has to be sterile and the conditions in the fluorometer are not sterile.
To analyse my data, I subtracted the blanks (wells without cells but with substance A, B,...), averaged the solvent control afterwards and expressed the substance groups in % of this averaged control. Afterwards I averaged every test group so that I would get something like: substance A = 50% viability in % of control, substance B = 50% viability in % of control, substance A+B = 25% viability in % of control, control = 100% viability in % of control. I repeated this for each of the 6 experiments so that I would get 6 data points for substance A, B, A+B and solvent control. My table of data is similar to this:
Solvent control: 100%, 100%,...., 100% median: 100%
Substance A: 51%, 48%,.... 65% median: 50%
Substance B: 47%, 56%,.... 50% median: 50%
Substance A+B: 25%, 26%.... 21% median: 24%
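The normalisation steps described above (blank subtraction, averaging the solvent control, expressing each group in % of that average) can be sketched in Python for one experiment. The raw readings, the blank values and the function name `percent_of_control` are all hypothetical, chosen only so that the arithmetic is easy to follow.

```python
from statistics import mean

# Hypothetical blank readings (no cells) per condition, in relative fluorescence units.
blank = {"control": 100.0, "A": 110.0, "B": 105.0, "A+B": 115.0}

# Hypothetical raw readings of the 6 wells per condition from ONE experiment.
wells = {
    "control": [2100.0, 2050.0, 2150.0, 2080.0, 2120.0, 2100.0],
    "A":       [1110.0, 1090.0, 1130.0, 1100.0, 1120.0, 1110.0],
    "B":       [1105.0, 1085.0, 1125.0, 1095.0, 1115.0, 1105.0],
    "A+B":     [615.0, 605.0, 625.0, 610.0, 620.0, 615.0],
}


def percent_of_control(wells, blank):
    """Return the mean viability of each group in % of the mean solvent control."""
    # Step 1: subtract the matching blank from every well.
    corrected = {g: [v - blank[g] for v in values] for g, values in wells.items()}
    # Step 2: average the blank-corrected solvent control.
    control_mean = mean(corrected["control"])
    # Step 3: express every well in % of that average, then average per group.
    return {
        g: mean([100.0 * v / control_mean for v in values])
        for g, values in corrected.items()
    }


viability = percent_of_control(wells, blank)
for group, value in viability.items():
    print(f"{group}: {value:.1f}% of control")
```

Note that this collapses the 6 wells of each group into one number per experiment, so the within-experiment spread is no longer visible afterwards.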
Now that I have this data, I am not sure how to analyse it statistically.
Shall I test for a difference between A and A+B and between B and A+B? Which test shall I use: one for paired or one for unpaired samples?
If I test for difference, shall I write afterwards something like this: The combined treatment is statistically different from the single substances. As the median of the combined treatment is smaller than that of the single treatments, there cannot be a complete antagonism of the single effects. However, more experiments are needed to decide if there is a slight antagonism, addition or synergism of the single effects when the substances are combined. (I do not know if you know Webb's formula.) Interestingly, the viability of the combined treatment is nearly the product of the viabilities of the single treatments (0.5*0.5=0.25). I thought of testing for a difference between the calculated combined treatment and the actually measured viability of the combined treatment.
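The product rule mentioned here (often called the fractional product method or Bliss independence) can be written out per experiment. The sketch below is only an illustration under assumptions: it uses the three example values per group that are visible in the table above (the elided ones are not filled in), and the names `expected` and `difference` are hypothetical.

```python
# Example viabilities as fractions of control, one entry per experiment.
# Only the values shown in the example table are used; they are illustrative.
viab_a = [0.51, 0.48, 0.65]
viab_b = [0.47, 0.56, 0.50]
viab_ab_observed = [0.25, 0.26, 0.21]

# Expected combined viability if the two substances act independently:
# the surviving fraction is the product of the single surviving fractions.
expected = [a * b for a, b in zip(viab_a, viab_b)]

# Observed minus expected, per experiment.
difference = [obs - exp for obs, exp in zip(viab_ab_observed, expected)]

for i, (exp, obs, diff) in enumerate(zip(expected, viab_ab_observed, difference), 1):
    print(f"experiment {i}: expected {exp:.3f}, observed {obs:.3f}, difference {diff:+.3f}")
```

Under this reference model, an observed viability clearly below the expected product would point towards synergism and one clearly above it towards antagonism. Whether the per-experiment differences deviate from zero would have to be tested on all six experiments, and the choice of reference model is itself an assumption.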
Or shall I do a one-way or two-way ANOVA, a Friedman two way ANOVA,...? HELP!
I hope my descriptions are clearer now. If there are questions left please go ahead and ask me. Thank you very much in advance!
@Noel:
Thank you for your mail with the example. Yes, you understood me. Only the viability differs in my experiment. The control is per definition 100% and the concentration of substance A and B were chosen so that the viability of both are nearly 50% in comparison to control. With dummy data the table looks more like this:
                 Substance A    Substance B    Viability
Control               0              0           100%
Substance A           1              0            50%
Substance B           0              1            50%
Substance A+B         1              1            24%
Did you do a two-way or a one-way ANOVA in your mail?
If I am not wrong, I think data in % of control is never normally distributed and that is why I cannot use a normal ANOVA. Would it be better to return to relative fluorescence units? I think % of control is easier to understand for a figure and for the description.
What do you think? What would you advise me?
Sandra, I sent you a message about an hour ago; I think the analysis that I show you in that message is what you may want to do.
Hi,
As you have repeated this experiment several times on independent days and weeks, your samples are not paired and you should do an unpaired two-sample t-test.
Thank you! Just have seen it and answered it in the post before your last one. My questions to you are:
1. Did you do a one- or a two-way ANOVA?
2. What about the missing normal distribution? I was told data in % of control was not normally distributed.
3. But with your statistical analysis I cannot answer the question "Are the single effects of the substances antagonised in the combined treatment?" and "If not is there an addition or synergism of the single effects?"
What is your advice here?
Thank you in advance!
I had a look once more at your mail and your example.
You did pool my data and did not average the data per week?
I mean you used 36 data points per substance group and not 6 data points?
Is that okay?
I was told you can only analyse independent experiments. By independent and dependent I mean that data obtained at the same time point are dependent and data obtained at different time points, especially on different days/weeks/..., are independent.
Sandra,
1.) I did not pool (average) any data; the measurements were entered into the statistical package as shown in the table in the example (120 rows corresponding to 4 treatment combinations x 6 replicates x 5 days; in your case you have 6 days, so you will have 144 data points).
2.) We also may have some communication problems because we are in different fields and use the same terms to signify different things; in "my book":
a) You have done ONE experiment
b) That has TWO Factors (or independent variables): substance A and B.
c) Each of these two factors is set at two levels (present or not present)
d) When both factors are set to "not present", you call that experimental condition your "control"
e) We think of independent measurements (I believe you have 144 of them), not of independent experiments.
f) From your description of the experiment, I have no reason to believe that one measurement might somehow affect another, even for measurements done within the same day. Consequently, I may assume that the measurements are independent. This assumption can be (and should be) corroborated with an analysis of the residuals of your final model.
g) Because - I believe - measurements within the same day are going to have less variability than measurements between days, my advice is to introduce the variable "day" as a blocking variable. This is a precaution; if in the analysis "day" is not significant, then you can drop it from the model.
3) If you include the interaction term in the ANOVA, you can not only answer the questions "Are the single effects of the substances antagonized in the combined treatment?" and "If not is there an addition or synergism of the single effects?" but you can also quantify these effects. I did that in the example that I sent you. We were able to state: "With a confidence level of 95%, when Substances A and B are combined (in whatever fixed proportion you decided) the response decreases by between 1.99 and 3.43 units (with respect to the control – no substances A or B used)" ... so, with the dummy data, combining the substances had an antagonistic effect (assuming that you want your response to be as large as possible).
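The layout and the interaction term described in points 1) to 3) can be sketched in Python. Everything numeric here is invented (the cell means in `BASE`, the day shifts, the replicate jitter), and the helper names `cell_mean` and `interaction` are hypothetical. The sketch builds one row per measurement with the factors A and B coded 0/1 and the day kept as a column, then computes the point estimate of the A-by-B interaction from the four cell means. It does not perform the ANOVA test itself.

```python
import math
from itertools import product
from statistics import mean

DAYS = range(1, 7)        # 6 experiment days (blocks)
REPLICATES = range(1, 7)  # 6 wells per treatment combination per day

# Hypothetical cell means in raw units, keyed by (A present, B present).
BASE = {(0, 0): 2000.0, (1, 0): 1000.0, (0, 1): 1000.0, (1, 1): 500.0}

# Long format: one row per measurement, nothing averaged.
rows = []
for day, (a, b), rep in product(DAYS, BASE, REPLICATES):
    # Made-up multiplicative day shift plus a small replicate jitter.
    value = BASE[(a, b)] * (1 + 0.02 * (day - 3.5)) + (rep - 3.5)
    rows.append({"day": day, "A": a, "B": b, "replicate": rep, "value": value})


def cell_mean(a, b, transform=lambda x: x):
    """Mean of the (optionally transformed) response in one treatment cell."""
    return mean([transform(r["value"]) for r in rows if r["A"] == a and r["B"] == b])


def interaction(transform=lambda x: x):
    """A-by-B interaction contrast: (AB - A) - (B - control)."""
    return (cell_mean(1, 1, transform) - cell_mean(1, 0, transform)
            - cell_mean(0, 1, transform) + cell_mean(0, 0, transform))


raw_interaction = interaction()          # zero would mean effects add in raw units
log_interaction = interaction(math.log)  # zero would mean effects multiply

print(f"rows: {len(rows)}")
print(f"interaction on raw scale: {raw_interaction:.2f}")
print(f"interaction on log scale: {log_interaction:.5f}")
```

With these invented cell means the combined viability is the product of the single viabilities, so the interaction is clearly non-zero on the raw scale and essentially zero on the log scale. Which scale defines "additive" is therefore a modelling decision that should be stated before the interaction term is interpreted.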
Finally, a disclaimer: I am not a biologist and I have no idea of (a) what you are measuring or (b) how you are measuring it or (c) what "fluorescence units" are :-)
Sorry I am quite tired so that might be the reason that I am quite lost.
When I only look at your example data I really do not see any real difference between the control data and the data of the substance groups. Everything is nearly about 50-55. I think that is why I have problems to understand the following.
Once I have analysed my data as in your example, I do not understand where I will see whether or not A antagonises the single effect of B in the combined treatment. And is it possible with your analysis to determine if the single effects are added or synergised in the combined treatment? How is this shown in the output?
Send you some more questions by mail. Thx
@Noel:
Wrote you via Research Gate 3 days ago. Cannot find your mail address any longer. No time today. Will answer you tomorrow, respectively send you the questions again.
Generally, if you compare all the groups, you should use one-way ANOVA.
If you compare any two groups, you should use a t-test, and in your experiment it should be the unpaired test, because your samples are not paired. A paired test applies when you have a single group of samples and want to know the effect of your substance: you record the samples before treatment as group C and the same samples after treatment as group E. When you compare group C and group E you should use a paired test, because both sets of measurements come from the same samples and are therefore dependent. In your experiment, however, the groups are independent, not paired, so you should use an unpaired test.
I think your problem can be broken down to the question if one series of experiments (A, B, AB) was performed simultaneously and shared a common factor that might explain parts of the variance (e.g. if you used the same rack and racks differ from each other with respect to the concentration of pre-incubated substances). If so, a paired test should be performed. If not, you can go with an unpaired test.
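The practical consequence of this choice can be shown with a small Python sketch. The six value pairs are invented, with a strong shared shift per experiment, and the function names `paired_t` and `welch_t` are just labels for the two textbook t statistics. The sketch computes both statistics on the same numbers.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical viabilities from 6 experiments; each experiment shifts both
# groups up or down together (a shared factor such as plate or passage).
group_a = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
group_ab = [41.0, 49.0, 61.0, 69.0, 81.0, 89.0]


def paired_t(x, y):
    """t statistic of the paired test: based on within-experiment differences."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def welch_t(x, y):
    """t statistic of the unpaired (Welch) test: ignores the pairing."""
    standard_error = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


t_paired = paired_t(group_a, group_ab)
t_unpaired = welch_t(group_a, group_ab)

print(f"paired t statistic:   {t_paired:.2f}")
print(f"unpaired t statistic: {t_unpaired:.2f}")
```

When the shared factor explains much of the variance, as in these invented numbers, the paired statistic is much larger because the shift common to both groups cancels in the differences. If no such shared factor exists, the paired test gains nothing and loses degrees of freedom.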
However, this might not be the primary problem of your analysis. Your data are normalized to a control value. That does not necessarily mean that they fail to fulfil the assumptions of parametric tests. However, you have to check. Have a look at the distribution of the residuals in your data and look out for heteroskedasticity. If you think that simple parametric methods like t-tests and ANOVAs are not appropriate, it might be helpful to use mixed models in your case, but you should see a statistician for that.
It depends on how different the cell lines are. If the only variable is the treatment then you use the paired test, if you have additional variables then you really don't have paired testing.
As I understood Sandra's question, it is only one cell line. In my experience working with cell lines, we usually consider an experiment to be paired by passage, because the cells normally come from the same culture suspension, which was split. So the passage factor is common to all the plates/bottles seeded from that particular cell suspension.
Hi! You can use whichever test you want, but the question is what you want to learn from your experiments. If you want to know whether your system is paired or unpaired, that is very easy: apply Student's t-test or the Tukey test. If you want to know how your cell line behaves, you need a more complete experimental design. I can help you; please send me the problem you want to address so that I can plan the experiments.
Sorry, I think you misunderstood me. The experiments are finished. Now I want to analyse the data. I have already described the experimental design that was used several times. Please read all my posts in which I hopefully clearly describe the design, my problems and questions concerning the analysis of my data. If there are still questions left, please go ahead and ask me. Thank you in advance.
As I understand it, your material was one single cell line from which you took samples to generate the groups.
The problem is that we do not fully understand the methodology described in the first post.
I think what you did was the following (correct me if this is not the case):
You took a cell line and started 4 cultures (from now on independent of each other)
culture I: cell culture+substance A
culture II: cell culture + substance B
culture III: cell culture + substance A and B
culture IV: cell culture without substance (control)
Now (this is crucial for the answer):
1. You compared the concentration of the substance in:
1.1 culture I vs. concentration of substance in culture IV?,
1.2 culture II vs culture IV?
1.3 culture III vs culture IV?
and so on?.....
If this is what you did, then your results are NOT PAIRED!!!
2. But if you compared culture I (day 1) with culture I (day 2, later on the first day, or at any other time), then what you did was paired:
2.2 concentration of substance in culture I (hour 0) vs concentration in culture I after 12 hours (hour 12)
Hope this helps :)
I updated the initial question as I feel that it wasn't clear enough. Furthermore, during the on-going discussion some more questions appeared. I hope everything is clearer now.
@Wojciech Francuzik: I think my experiments are like your first one. Sorry don't really understand your last example.
Hi,
In your case, generally, I would first check the distribution of the variables for normality (normality test). Since you have just one cell line, I would try a one-way ANOVA to see the effects of the different treatments, then I would apply a post hoc procedure for multiple comparisons of means, such as least-squares means (LSMEANS statement with Bonferroni correction). The output would give you all the possible comparisons and you can choose what you need.
As the discussion goes on, lots of opinions surface. I was following and found that there is still considerable difference of opinion about paired and unpaired. By the logic of the statistical test, "paired" applies only when the two sets of values come from exactly the same units subjected to two separate treatments (as rightly mentioned by Dr. Chong and Dr. Karch); in your case, although the cells are from the same cell line, each well of the culture plate contains a different set of cells, so they are definitely not paired.
Regarding the normality assumption, one needs to examine the data and test it for normality with a suitable test such as the K-S test or the A-D test available in software packages.
If the normality assumption is satisfied, it is generally better to go for a parametric test like one-way ANOVA (the question of a two-way ANOVA does not arise in this case).
There is also no apparent problem with using the average of the data obtained. For determining synergism or antagonism one needs to apply one of the standard formulas available.
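The Bonferroni step of such a post hoc analysis is simple enough to sketch in Python. The raw p-values below are invented placeholders, not results of any analysis, and the names `pairs`, `raw_p` and `adjusted` are hypothetical. The sketch enumerates all pairwise comparisons among the four groups and multiplies each raw p-value by the number of comparisons, capped at 1.

```python
from itertools import combinations

groups = ["control", "A", "B", "A+B"]

# All pairwise comparisons among the four groups.
pairs = list(combinations(groups, 2))

# Hypothetical raw p-values, one per comparison, in the order of `pairs`.
raw_values = [0.001, 0.002, 0.0005, 0.40, 0.004, 0.006]
raw_p = dict(zip(pairs, raw_values))

# Bonferroni adjustment: multiply by the number of comparisons, cap at 1.
n_comparisons = len(pairs)
adjusted = {pair: min(1.0, p * n_comparisons) for pair, p in raw_p.items()}

for pair, p in adjusted.items():
    print(f"{pair[0]} vs {pair[1]}: adjusted p = {p:.4f}")
```

If only a few comparisons are of interest (for example A vs A+B and B vs A+B), deciding on them before looking at the data reduces the number of comparisons and therefore the size of the correction.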