I have one data set whose result is extremely significant (p < 0.001) and another whose result is significant (p < 0.05). Is the difference between these levels of significance itself significant? Knowing this would help me draw the right conclusions.
Yes, I have drawn conclusions based on the global and local trends of the problem at hand, but I was wondering whether more could be interpreted from the difference in significance. Thanks for your input.
I would suggest reading the statement on p-values recently drafted by the American Statistical Association. It will certainly be helpful in understanding the use of p-values in your studies. https://www.amstat.org/newsroom/pressreleases/P-ValueStatement.pdf
There are several explanations for a difference between two levels of significance. The first is simply chance. The second is that your samples have different sizes. The third is that your samples have different amounts of noise. What I suggest is to perform a post-hoc power analysis to calculate the power of your tests. You may use G*Power (http://www.gpower.hhu.de) or R if you are proficient with that language. The power will show you whether you had a high probability of rejecting the null hypothesis given the sample size, the variance (including noise), and the level of significance. If this probability is very small for p < 0.001 but large for p < 0.05, you may have obtained an extremely odd result in the first experiment. However, I must note that you should set the level of significance a priori, before the experiment, rather than a posteriori, once you have obtained the p-values. Thus, if you set the level of significance to 0.05, both tests agree. Other approaches are to pool both experiments (and check for bimodal or multimodal data) or to perform a Bonferroni correction for multiple testing.
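For what it's worth, here is a minimal sketch of such a post-hoc power calculation in Python (statsmodels) rather than G*Power. It assumes a two-sample t-test, and the effect size and sample size are placeholders; substitute your own numbers.

```python
# Post-hoc power for a two-sample t-test; effect_size and nobs1 are made-up
# illustrative values, not numbers from the original question.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# effect_size is Cohen's d = (mean1 - mean2) / pooled SD, estimated from your data.
for alpha in (0.05, 0.001):
    power = analysis.solve_power(effect_size=0.5, nobs1=30, alpha=alpha,
                                 ratio=1.0, alternative='two-sided')
    print(f"alpha = {alpha}: power = {power:.2f}")
```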
Not much should be read into a p-value beyond the rejection of (or failure to reject) the null hypothesis. What counts after that is the substance of the science and anything the results might tell you about that substance.
Bear in mind that a p-value tells you something about whether you just made a mistake in rejecting the null hypothesis. If p is smaller, then you are less likely to have incorrectly rejected the null.
Having said these standard things, it might be worth examining the underlying statistical information of the tests. For a two-sample t-test this would be the sample sizes, the observed means and their difference, and the observed standard errors, both separately and pooled. If some of these differ between experiments, it might be a sign of stronger experimental control in one of the experiments. (It is worth examining these numbers anyway, regardless of the p-values.) The obvious place to look is for a difference in the sample sizes, of course.
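As an illustration of what to look at, here is a short sketch (assuming the raw data for both groups are available as arrays) that pulls out exactly those quantities for a two-sample t-test:

```python
import numpy as np
from scipy import stats

def t_test_ingredients(x, y):
    """Sample sizes, means, their difference, and standard errors (separate and pooled)."""
    nx, ny = len(x), len(y)
    mx, my = np.mean(x), np.mean(y)
    sx, sy = np.std(x, ddof=1), np.std(y, ddof=1)
    pooled_sd = np.sqrt(((nx - 1) * sx**2 + (ny - 1) * sy**2) / (nx + ny - 2))
    t, p = stats.ttest_ind(x, y)  # equal-variance (pooled) t-test
    return {
        "n": (nx, ny),
        "means": (mx, my),
        "mean_diff": mx - my,
        "se_separate": (sx / np.sqrt(nx), sy / np.sqrt(ny)),
        "se_pooled": pooled_sd * np.sqrt(1 / nx + 1 / ny),
        "t": t,
        "p": p,
    }
```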
Keep in mind that p-values are nonlinear in the test statistics; that is, relatively small differences in the numerator (the difference in means for a t-test) or the denominator (the standard error) can yield apparently large shifts in the p-values. Also keep in mind that the nature of statistical variation is to cause variation in summary statistics and in p-values.
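To see the nonlinearity concretely, here is a tiny sketch (the degrees of freedom are chosen arbitrarily, e.g. two groups of 30) showing how modest changes in the t statistic translate into large relative changes in the two-sided p-value:

```python
from scipy import stats

# Two-sided p-values for a t statistic with 58 degrees of freedom.
for t in (2.0, 2.5, 3.0, 3.5):
    p = 2 * stats.t.sf(t, df=58)
    print(f"t = {t}: p = {p:.4f}")
```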
Is the difference of .05 and .001 in p-values interpretable, or even meaningful, on its own? No.
Is it worth asking how this happened? Yes.
Could this difference be due to chance alone? Yes.
Could this difference be due to small differences in the numerator or denominator or both? Yes.
Can such examination be useful in interpreting the results for scientific inference? Possibly.
Can such examination be useful in future research? Probably.
"Bear in mind that a p-value tells you something about whether you just made a mistake in rejecting the null hypothesis. If the p is smaller then you are less likely to have incorrectly rejected the null."
No, that's just plain wrong. Please don't feed a common misconception.
Interestingly, you seem to contradict this a bit later when you say:
"Is the difference of .05 and .001 in p-values interpretable, or even meaningful, on its own? No."
Well, if your first statement were right, then p=0.001 would indicate that in this case the probability of a wrong rejection was about 1/50 of the probability of a wrong rejection when p=0.05. That would be a meaningful interpretation. But here you are right: that's complete nonsense, and it is nonsense because p-values do not tell you anything about the probability of making a wrong rejection!
p=0.001 tells you that data (or a test statistic calculated from the data) more extreme than what you observed are about 50 times less probable under the null hypothesis than they are for p=0.05. These are statements about the probability of data; nothing is said about rejections, mistakes, the truth of hypotheses, or anything like that.
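This interpretation can be checked by simulation: under the null hypothesis, the p-value is simply the probability of a test statistic at least as extreme as the one observed. A rough sketch (two equal-sized groups drawn from the same normal distribution; all numbers are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 30, 20_000

# Null world: both groups come from the same distribution.
t_null = np.array([stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0, 1, n)).statistic
                   for _ in range(reps)])

# The fraction of null t statistics at least as extreme as an "observed" one
# approximates the two-sided p-value for that observation.
for t_obs in (2.0, 3.5):
    print(f"|t| >= {t_obs}: {np.mean(np.abs(t_null) >= t_obs):.4f}")
```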
The only way to test whether two things are significantly different is to compare them directly. As Gelman and Stern (2006) note, the difference between significant and not significant is not in and of itself significant. The same applies to any other pair (e.g., the difference between "significant at α=.05" and "significant at α=.01" is not in and of itself significant).
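A minimal sketch of such a direct comparison (made-up effect estimates and standard errors, using a normal approximation): two effects that are individually "highly significant" and "just significant" against zero, yet whose difference is nowhere near significant.

```python
import numpy as np
from scipy import stats

# Hypothetical effect estimates and standard errors from two experiments.
est1, se1 = 2.0, 0.5   # z = 4.0, p ≈ 0.00006
est2, se2 = 1.0, 0.5   # z = 2.0, p ≈ 0.046

# Compare the two effects directly rather than comparing their p-values.
diff = est1 - est2
se_diff = np.sqrt(se1**2 + se2**2)
z_diff = diff / se_diff
p_diff = 2 * stats.norm.sf(abs(z_diff))
print(f"z = {z_diff:.2f}, p = {p_diff:.3f}")   # ≈ z = 1.41, p = 0.157: not significant
```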