When I test whether my qPCR data are normally distributed with a Shapiro-Wilk test (using SPSS), I sometimes find that they do NOT follow a normal distribution. Do you have this problem? What do you do about it?
I rarely test for normality. Such tests depend greatly on sample size: with a large sample, nothing looks normal; with a small one, everything does. Also, it is usually the residuals of an analysis, not the raw data, that you want to be normally distributed. In my opinion, formal normality tests are not reliable.
I look for symmetry, not normality. qPCR is often log transformed prior to analysis to help with influential outliers or skewed data. Look at the histogram. Is the mean a good measure of centrality? If so, you won't be too misled using parametric tests. If the median is a better measure, use nonparametrics. If the data do not follow some sort of distributional pattern, rethink the steps that created them.
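To illustrate the "symmetry, not normality" point above, here is a minimal sketch (the fold-change values are made up, not anyone's actual qPCR data): ratio-scale data are often right-skewed, so the mean sits well above the median on the raw scale, while after a log2 transform the two nearly agree, suggesting the log scale is the more symmetric one to analyse.

```python
import math
import statistics

# Hypothetical relative expression values (fold changes); right-skewed,
# as ratio data often is.
fold_changes = [0.25, 0.5, 0.7, 1.0, 1.4, 2.0, 4.0]

# On the raw scale the mean is pulled up by the large values...
raw_mean = statistics.mean(fold_changes)      # ~1.41
raw_median = statistics.median(fold_changes)  # 1.0

# ...while after a log2 transform mean and median nearly agree,
# so the mean is again a reasonable measure of central tendency.
logs = [math.log2(x) for x in fold_changes]
log_mean = statistics.mean(logs)              # ~0.0
log_median = statistics.median(logs)          # 0.0

print(raw_mean, raw_median)
print(log_mean, log_median)
```

The same comparison of mean versus median (or a quick histogram) works in SPSS or R without any formal test.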
An additional comment: at what level do you run the test? If at 5%, for instance, and you are running several tests, it is expected that some of them (about 5%) come out significant simply because you run the test many times, even if your data are perfectly normal (which they probably are not, but that's another debate)...
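The multiple-testing point above can be made concrete: with k independent tests each at level alpha, the chance of at least one false positive is 1 − (1 − alpha)^k, which grows quickly with k.

```python
# Chance of at least one spuriously "significant" normality test when
# running k independent tests at level alpha on perfectly normal data.
alpha = 0.05

for k in (1, 5, 10, 20):
    p_at_least_one = 1 - (1 - alpha) ** k
    print(k, round(p_at_least_one, 3))
```

With 10 groups tested at 5% each, the chance of at least one false rejection is already about 40%.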
@Naveen: normality does not depend on the sample size; why should it? Your ability to detect deviations from normality does, however, as Warren May said.
Well, I kept the default, which is 5%. But I used ANOVA before realising that the data were not all normally distributed. What do you suggest? What would be best then?
When you say data & ANOVA, do you mean normal distribution of the _data_ _AS A WHOLE_, or by group? Please note that for ANOVA, normal distribution and equality of variance of the _errors_ are the assumptions, and they should be checked on the residuals, not on the raw data...
Emmanuel Curis, sorry for the late reply. I am not sure I understand what you mean by data "as a whole" versus "by group". I have different conditions (different time points, or different treatments at the same time point) and I have checked for normal distribution in each of them. I think I will use a one-sample Wilcoxon signed-rank test (as my control is always equal to 1 and I can't access its distribution). Let me know what you think.
@ Virginie: What you did is what I meant by « by group »; it is a right way. Note, however, that you may get a better picture by studying the residuals as a whole (ideally externally studentized residuals, compared to a Student's t distribution with the right degrees of freedom, but in practice that is probably not important and you can directly compare the raw residuals to a Gaussian). This is less sensitive in particular to false-positive Shapiro-Wilk (or other normality) tests, since you run only one test for the whole design instead of one per group. That is probably not an issue, however, if your data are « evidently » non-Gaussian.
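The pooling idea above can be sketched as follows (the expression values and group names are made up): subtract each group's mean to get raw residuals, pool them across groups, and then run a single normality check (e.g. one Shapiro-Wilk test in SPSS or R) on the pooled residuals instead of one test per group.

```python
import statistics

# Hypothetical relative-expression values per condition (made-up numbers).
groups = {
    "control":   [0.9, 1.0, 1.1, 1.0],
    "treated_A": [1.8, 2.1, 2.0, 2.2],
    "treated_B": [0.5, 0.6, 0.4, 0.55],
}

# Raw residual = observation minus its own group mean; pooling them gives
# one sample on which a single normality check can be run.
pooled_residuals = []
for values in groups.values():
    m = statistics.mean(values)
    pooled_residuals.extend(v - m for v in values)

print(len(pooled_residuals))                 # one pool of 12 residuals
print(round(abs(sum(pooled_residuals)), 10)) # ~0: residuals are centred
```

The single test on the pool then replaces the three per-group Shapiro-Wilk tests, which is the point made above about false positives.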
The equivalent of one-way ANOVA would be the Kruskal-Wallis test. A Wilcoxon signed-rank test implies some kind of pairing, which was not apparent in your ANOVA description (at least to me). It is difficult to say more than that without details on your experimental design.
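For reference, here is a minimal hand computation of the Kruskal-Wallis H statistic on made-up data from three conditions (no tied values, so no tie correction is needed); in practice one would use software, e.g. `kruskal.test` in R or `scipy.stats.kruskal` in Python, which also return a p-value.

```python
# Kruskal-Wallis H statistic computed by hand (no tied values) on made-up
# data from three conditions.
groups = [
    [1.2, 1.5, 1.1],
    [2.4, 2.9, 2.2],
    [0.7, 0.9, 0.8],
]

# Rank all observations together (1 = smallest).
all_values = sorted(v for g in groups for v in g)
rank = {v: i + 1 for i, v in enumerate(all_values)}

n = len(all_values)
h = 12 / (n * (n + 1)) * sum(
    sum(rank[v] for v in g) ** 2 / len(g) for g in groups
) - 3 * (n + 1)

# With 3 groups (df = 2), H above 5.99 is significant at the 5% level.
print(round(h, 2))  # 7.2
```

Here the three groups separate perfectly in rank, so H = 7.2, above the 5% chi-square cutoff for 2 degrees of freedom.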
I'm surprised that your control is always 1; it seems too reproducible... Are you using relative values and imposing control = 1 / 100% / reference value?
Yes, it is relative quantification imposing control = 1 = 100%. I can't know the variation of the control because of the way the qPCRs were run...
So it seems to me that I can only do one-sample tests, comparing all data points to the control (either, for normally distributed data, a one-sample t-test using 1 as the reference mean, or, for non-normally distributed data, a one-sample Wilcoxon signed-rank test using 1 as the reference median).
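A minimal sketch of both one-sample statistics against the reference value 1, using made-up fold changes (in practice SPSS, or `t.test` / `wilcox.test` in R, would also give the p-values):

```python
import math
import statistics

# Made-up fold changes relative to a control fixed at 1.
x = [1.4, 1.8, 0.9, 2.1, 1.6, 1.3, 2.4, 1.15]

# One-sample t statistic against the hypothesised mean of 1.
n = len(x)
t = (statistics.mean(x) - 1) / (statistics.stdev(x) / math.sqrt(n))

# Wilcoxon signed-rank statistic against the hypothesised median of 1:
# rank the |x - 1| values, then sum the ranks of positive differences (W+).
diffs = sorted((abs(v - 1), v - 1 > 0) for v in x)
w_plus = sum(i + 1 for i, (_, positive) in enumerate(diffs) if positive)

print(round(t, 2))  # 3.29; compare to the t critical value, df = n - 1
print(w_plus)       # 35; compare to Wilcoxon signed-rank tables
```

Note the caveat in the replies below, though: fixing the control at exactly 1 hides its own variability, so these one-sample tests can be misleading.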
Let me know what you think and thanks a lot for your help.
Sorry for the long delay in answering; I think RG forgot to tell me a message was waiting.
You should never impose control = 100%. You may perhaps use this for graphical representation purposes, but nothing more.
There are several methods to compare the groups with the control without imposing this; the choice depends on your experimental design, but in the simplest case the so-called DDCt (ΔΔCt) method should do the job *if you have a paired design*.
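For one paired sample, the ΔΔCt arithmetic is just this (the Ct values and gene labels below are made up; "GOI" = gene of interest, "REF" = reference gene):

```python
# ΔΔCt for one paired sample (made-up Ct values).
ct = {
    "control": {"GOI": 24.0, "REF": 18.0},
    "treated": {"GOI": 22.5, "REF": 18.2},
}

# ΔCt = Ct(GOI) - Ct(REF), computed within each condition.
d_ct_control = ct["control"]["GOI"] - ct["control"]["REF"]   # 6.0
d_ct_treated = ct["treated"]["GOI"] - ct["treated"]["REF"]   # ~4.3

# ΔΔCt = ΔCt(treated) - ΔCt(control); fold change = 2 ** (-ΔΔCt),
# assuming ~100 % amplification efficiency for both genes.
dd_ct = d_ct_treated - d_ct_control
fold_change = 2 ** -dd_ct

print(round(dd_ct, 2))        # -1.7
print(round(fold_change, 2))  # ~3.25-fold up-regulation
```

Because each ΔCt uses the reference gene measured in the same sample, the control's variability enters the comparison instead of being fixed at 1.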
For more complex cases, it would be worth consulting a statistician for help.
I would also recommend checking the SARP.compo package for R and reading