Statistical significance is a technical term with a precise mathematical definition: p = Pr(|t| > |t_obs| | M, H0), in words: the probability, under a given statistical model (M) with given restrictions (H0), of a test statistic (t) being "more extreme" than the "observed" test statistic (t_obs).
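To make the definition concrete, here is a minimal simulation sketch. The specifics are purely illustrative assumptions, not from the post: M is taken to be a one-sample t-test on 20 normal observations and H0 fixes the mean at 0. The code estimates Pr(|t| > |t_obs| | M, H0) by drawing many datasets under H0 and compares the result with the analytic p-value.

```python
# Sketch: the p-value as Pr(|t| > |t_obs| | M, H0), illustrated by simulation.
# Assumptions (not from the post): M is a one-sample t-test on n = 20 normal
# observations, and H0 fixes the population mean at 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20

# "Observed" data and its test statistic t_obs.
observed = rng.normal(loc=0.5, scale=1.0, size=n)
t_obs = stats.ttest_1samp(observed, popmean=0.0).statistic

# Simulate the distribution of t under M and H0 (true mean 0),
# then count how often |t| exceeds |t_obs|.
t_null = np.array([
    stats.ttest_1samp(rng.normal(loc=0.0, scale=1.0, size=n), popmean=0.0).statistic
    for _ in range(20_000)
])
p_simulated = np.mean(np.abs(t_null) > np.abs(t_obs))

print(f"t_obs = {t_obs:.3f}")
print(f"simulated p = {p_simulated:.4f}")
print(f"analytic  p = {stats.ttest_1samp(observed, popmean=0.0).pvalue:.4f}")
```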
Historically, scientists have learned to call an experimental finding "(statistically) significant" if this probability was below some threshold, usually 5%: p < 0.05 means "significant". They have forgotten that statistical significance is a continuous spectrum rather than a yes/no property. They ask (and answer) the faulty question "is my data significant?" instead of the more sensible question "how significant is my data?". They have also preferred the shortcut "significant" for "statistically significant", which makes it difficult to tell when the technical term is meant and when the usual, colloquial meaning of the word "significant" (= relevant, important, impactful) is meant.
This common practice led to the convention that only "(statistically) significant" results could be published, and researchers aimed to produce such "significant" results. I think this in turn fostered the (faulty!) understanding that only significant results were important or relevant, and non-significant results were unimportant or irrelevant, further aggravating the confusion.
For many decades statisticians have been insisting that this interpretation is dubious and misleading: "statistical significance" does not mean the same as the colloquially used word "significance". To make it particularly clear that one is NOT talking about statistical significance, some people adopted the term practical significance.
Most applications of statistical tests are meant to show that some minimum threshold was met, and what that minimum should be is part of the study objectives. The structure of the expected data and the effect being sought may call for a specific measurement, and that measurement should indicate how it is established. Propagation of error through the process requires that the minimum lie above the error. When the minimum is reached, one can claim statistical significance, but accepting the minimum is usually not sufficient: statistical significance only gives a probability for accepting the result. If the statistical test was specified in the objectives of the investigation, this probability may indicate that the result is in the direction of the objective. To be of practical significance, the result must also be of sufficient quality to answer the question posed by the objectives.
Hi Levan, this difference is always being discussed among researchers. These are my views on it; I have kept it simple.
Statistical significance refers to whether the observed effect is larger than we would expect by chance, i.e. whether we can reject the null hypothesis that there is no effect. This is what is typically addressed by the p-values associated with t-tests, ANOVAs, etc.
Practical significance is about whether we should care, i.e. whether the effect is useful in an applied context. An effect could be statistically significant, but that doesn't in itself mean it's a good idea to spend money/time/resources on pursuing it in the real world. The truth is that in most situations the null hypothesis is never exactly true: two groups will almost never be *exactly* the same if you were to test thousands or millions of people. That doesn't mean that every difference is interesting. Practical significance is usually assessed with effect size measures (e.g. Cohen's d, which has criteria for 'small', 'medium' and 'large' effects), but it generally also needs to take into account the context of the particular study (e.g. clinical research will have different expectations than personality psychology in terms of what kind of effects can be expected).
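As a hedged illustration of that last point, the sketch below simulates two large groups whose true means differ only slightly (all numbers are made up for illustration): the t-test comes out "significant" while Cohen's d stays well below the conventional 0.2 "small" benchmark.

```python
# Sketch: a statistically significant but small effect, with Cohen's d.
# The group means, SD, and sample sizes below are illustrative, not from the post.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5_000                                              # large groups make tiny effects "significant"
group_a = rng.normal(loc=100.0, scale=15.0, size=n)
group_b = rng.normal(loc=101.0, scale=15.0, size=n)    # true difference of 1 point

t, p = stats.ttest_ind(group_a, group_b)

# Cohen's d with a pooled standard deviation (equal group sizes).
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
d = (group_b.mean() - group_a.mean()) / pooled_sd

print(f"p = {p:.4g}")          # typically well below 0.05 here
print(f"Cohen's d = {d:.2f}")  # around 0.07, far below the 0.2 'small' benchmark
```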
The previous responses have clearly made the distinction between statistical and practical significance; I just want to add the role of sample size. Let's say that you are interested in a political party and how it is received by the voters. A first, large poll has been made, and your party is favored by 30% of the voters. Then there is a period of some political turbulence and a new poll is conducted. A sample of 10 voters are in the poll and 4 of them favor your party, i.e. the estimate for your party is 40%. Clearly you would consider this "change" of practical importance, but you also understand that it is not statistically significant, as the sample is really small. In another poll that consists of millions of voters your party receives 30.01%, which you deem of no practical importance, yet it is statistically significant simply because the sample size is enormous.
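A rough numerical sketch of these two polls (the exact counts, and in particular the size of the "enormous" poll, are illustrative assumptions; with a 0.01-point shift the sample has to be on the order of 10^8 before the difference becomes significant, which is itself part of the point):

```python
# Sketch of the two polls above; the exact counts are illustrative assumptions.
from math import sqrt
from scipy import stats

p0 = 0.30   # share from the first, large poll, treated as the null value

# Small poll: 4 of 10 respondents favour the party (40%).
small = stats.binomtest(k=4, n=10, p=p0, alternative="two-sided")
print(f"small poll: estimate 40.00%, p = {small.pvalue:.3f}")   # far above 0.05

# Huge poll: 30.01% of ~100 million respondents (normal-approximation z-test).
n_big, share_big = 100_000_000, 0.3001
z = (share_big - p0) / sqrt(p0 * (1 - p0) / n_big)
p_big = 2 * stats.norm.sf(abs(z))
print(f"huge poll : estimate 30.01%, p = {p_big:.3f}")          # below 0.05
```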
A weakness of the p-value and the practice of 5% significance is that almost any test can be turned into a statistically significant one by using a larger sample / more experimental units, as no simple null hypothesis is completely accurate.
The book Sense and Nonsense of Statistical Inference. Controversy, Misuse and Subtlety by C. Wang (1993) has much good stuff on all this. As to getting away from the 0.05 "rule", I like the approach taken by Drennan (2009) Statistics for Archaeologists: a commonsense approach, pp. 157--60, esp. Table 12.3.
The best book I have ever read on the topic is "The Cult of Statistical Significance" by Stephen Ziliak and Deirdre McCloskey. It tells the stories of Fisher, Gosset (Student), Pearson et al. and how they shaped our understanding and misunderstanding of what data tell us. If you Google their names, you will find a really nice PowerPoint presentation and a brief paper also on the same topic and with the same title. Besides the more elegant and eloquent explanations above, I often tell my undergrads that statistical significance helps us understand the likelihood that there is "an" effect (difference, relationship), while practical significance tells us whether the effect is big enough to care. Bottom line is we should simply ditch significance testing and report effect size estimates with confidence intervals, which answers both questions ("What is the effect?" rather than "Is there an effect?")
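In that spirit, here is a minimal sketch of reporting an effect estimate with a confidence interval rather than a bare p-value; the simulated data and the simple pooled-df approximation are assumptions for illustration only.

```python
# Sketch: reporting "what is the effect?" as an estimate with a 95% CI,
# instead of only asking "is there an effect?". Data below are simulated assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
treatment = rng.normal(loc=52.0, scale=10.0, size=200)
control   = rng.normal(loc=50.0, scale=10.0, size=200)

diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / len(treatment) + control.var(ddof=1) / len(control))
df = len(treatment) + len(control) - 2                 # simple pooled-df approximation
ci_low, ci_high = diff + np.array([-1, 1]) * stats.t.ppf(0.975, df) * se

print(f"estimated difference = {diff:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```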
I don't have much to add to the discussion because it has been well said. I think effect sizes, especially in large samples, are more important than p values. Below are a few resources that speak to the matter. The first is a paper by Cohen on the (mis)use of p values. The second is a link to the new ASA statement on p values, which starts with two interesting questions about why they are relied on so much. The third is a link to an article by Nate Silver; it is an easy-to-read and very basic summary. I joke with my students that if they stood outside the library on campus and surveyed people, everyone would know the magic p value (i.e., .05), even if that is all they know about stats. It is the one fact everyone remembers from their stats courses and also the one fact that potentially leads to the misunderstanding of the effects of an experiment.
Just to illustrate the two terms: suppose you have a new medicine for hypertension and a study that compares it with an older medicine. The old medicine brings the average pressure to 140 and the new medicine to 135, and since you had a large sample the result is statistically significant, but it is of no practical significance.
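A small sketch of this example (the sample size and the spread of the blood-pressure readings are assumptions, not from the post):

```python
# Sketch of the hypertension example above; the sample size and the
# spread of blood-pressure readings (SD = 15 mmHg) are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 20_000
old_drug = rng.normal(loc=140.0, scale=15.0, size=n)   # average systolic BP 140
new_drug = rng.normal(loc=135.0, scale=15.0, size=n)   # average systolic BP 135

res = stats.ttest_ind(new_drug, old_drug)
print(f"mean difference = {new_drug.mean() - old_drug.mean():.1f} mmHg, p = {res.pvalue:.2g}")
# With samples this large the p-value is essentially zero; whether the
# ~5 mmHg difference matters is a clinical judgement the test cannot make.
```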
I can sum up the difference between statistical and practical significance in three points:
1. Statistical significance refers to how unlikely it is that the result was obtained by chance, i.e., the probability that a relationship between two variables exists. Practical significance refers to the relationship between the variables and the real-world situation.
2. Statistical significance depends on the sample size; practical significance depends on external factors like cost, time, objective, etc.
3. Statistical significance does not guarantee practical significance, but to be practically significant, a result must also be statistically significant.
Statistical significance has to do with the likelihood of obtaining a sample value at least as large (in absolute value) as the one actually obtained, assuming that the null hypothesis is true. It's a kind of conditional probability.
Practical significance has to do with whether a sample value matters from a clinical, policy, or whatever point of view. The distinction is made because it's possible for a finding to be statistically significant even if it doesn't matter much from a practical point of view.
For example, a study may find that some difference between treatment and control groups is statistically significant, even though, from a clinical point of view, that difference is so small that it's regarded as unimportant.
I totally agree with the responses of the four researchers/teachers who preceded me in this discussion group. In fact, the statistical and mathematical arguments and the practical design statements are complementary.
I emphasize the fact that a "statistically significant difference" is not synonymous with a "real or practically meaningful difference"; these labels are often used wrongly.
The expression "Difference Statistics Meaningful" is used when results are obtained as a result of an investigation and that the level of "Statistical Significance" [which has previously been established as alpha or probability level (p