Here is an important paper for the field of ischemic stroke, where over a thousand suggested targets and therapies have all (!) failed when further tested. The authors conclude there is massive publication bias, i.e. negative data are not published, leaving underpowered papers in the literature. This is on top of poor study quality and a lack of power calculations beforehand. What drives people to this? One cause is certainly the funding institutions and the bureaucrats in universities who count peas, papers, and impact factors.
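For anyone who wants to see what "underpowered" looks like in numbers, here is a minimal sketch (my own illustration with a hypothetical effect size and group sizes, not figures taken from the paper) of an approximate power calculation for a two-group comparison:

```python
# Minimal sketch: approximate power of a two-sided, two-sample comparison
# under a normal approximation. Effect size and group sizes are hypothetical.
from scipy.stats import norm

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Two-sided two-sample z-test power under a normal approximation."""
    z_alpha = norm.ppf(1 - alpha / 2)
    se = (2.0 / n_per_group) ** 0.5          # SE of the standardized difference
    z_effect = effect_size / se
    return norm.cdf(z_effect - z_alpha) + norm.cdf(-z_effect - z_alpha)

for n in (5, 10, 20, 50):
    print(n, round(approx_power(effect_size=0.5, n_per_group=n), 2))
# With a medium effect (d = 0.5) and 10 subjects per group, power is only ~0.2:
# most real effects are missed, and the "significant" ones are overestimated.
```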
Well, that might be true. However, one has to look at this very important topic in a very differentiated way. I agree that the use of statistics is misleading on several levels ... with potentially huge consequences ...
To answer your initial question: yes, lots of people have taken this seriously - so far I haven't understood why knowledge does not translate into appropriate behavior. One can only speculate about this:
Maybe statistics are misused because, fundamentally, there are no satisfying solutions to the problems that are currently "solved" with statistics ... and humans - especially scientists - do not like to live with non-numerical uncertainties ...
Maybe it is because statistics is very difficult to discuss, and even more so the limitations of statistics as a method.
Maybe it is because our scientific communication system favors flawed but frequently published and cited methods over appropriately limited and criticized ones. Maybe the way we cite research is misleading, so that such flawed methods become a "fact".
Most likely it is a combination of all of these factors ...
Good luck - it is indeed a very fascinating topic, and digging into it can tell you a lot about science, society, and people ... :)
You might want to look at The Cult of Statistical Significance by Ziliak and McCloskey. Hypothesis testing is only a theory, and a very useful one, but if it is wrongly applied or badly interpreted, the conclusions can be misleading and even harmful.
The book by Ziliak and McCloskey is OK up to a point, but overblown. Hypothesis testing is not the problem; it is a partial solution. The problem is that all sources of random variation are under-appreciated and under-estimated, and this problem intrudes no matter what kind of inference procedure is used to evaluate the evidence. Results that are striking enough to be published have been selected from a large pool of results (with or without a control for multiplicity within particular studies); part of what you see is simple regression to the mean. Another part of the problem is that studies are performed on people who satisfy strict criteria -- they are "supernormal" -- and are then generalized to a whole population that does not satisfy the criteria as rigidly, so that random variation is much larger.
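To make the selection effect concrete, here is a rough simulation (my own toy example with made-up numbers, not anything from the studies discussed here): many small studies of a modest true effect are run, and only the "significant" ones are reported.

```python
# Toy simulation of selective reporting: small, noisy studies of a modest true
# effect, with only the "significant" results being "published".
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
true_d, n_per_group, n_studies = 0.3, 15, 5000   # hypothetical values
published = []
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_d, 1.0, n_per_group)
    stat, p = ttest_ind(treated, control)
    if p < 0.05:                                  # only "positive" results get written up
        published.append(treated.mean() - control.mean())

print("true effect:", true_d)
print("mean published effect:", round(float(np.mean(published)), 2))
# The published average comes out around 0.85-0.9, roughly three times the
# true effect: regression to the mean plus selective reporting in action.
```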
What is lacking to date is a credible alternative to the way things are done now. An interesting book is "A Comparison of the Bayesian and Frequentist Approaches to Estimation" by Francisco J. Samaniego. He makes the case that Bayesian approaches improve on frequentist approaches only in the unlikely case that the prior distribution of the unknown parameter has its median close enough to the true value. A minor modification of his argument applies as well to testing whether a parameter is non-zero.
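As a toy illustration of that point (my own sketch under simple normal-normal conjugate assumptions, not Samaniego's actual analysis), the posterior mean beats the sample mean in mean squared error only when the prior is centred near the true value:

```python
# Monte Carlo comparison of the frequentist sample mean and the conjugate
# normal-normal posterior mean. All numerical values are assumed toy values.
import numpy as np

rng = np.random.default_rng(1)
true_theta, sigma, n, reps = 2.0, 1.0, 10, 20000
tau = 1.0                                           # assumed prior standard deviation

def mse(prior_mean):
    """Monte Carlo MSE of the sample mean vs. the posterior mean."""
    errs_freq, errs_bayes = [], []
    w = (n / sigma**2) / (n / sigma**2 + 1 / tau**2)   # weight on the data
    for _ in range(reps):
        xbar = rng.normal(true_theta, sigma, n).mean()
        post_mean = w * xbar + (1 - w) * prior_mean
        errs_freq.append((xbar - true_theta) ** 2)
        errs_bayes.append((post_mean - true_theta) ** 2)
    return round(float(np.mean(errs_freq)), 3), round(float(np.mean(errs_bayes)), 3)

print("prior centred on the truth:", mse(prior_mean=2.0))   # Bayes MSE is smaller
print("prior centred elsewhere:   ", mse(prior_mean=0.0))   # Bayes MSE is larger
```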
Nothing works better than replication, replication, replication - as far as I can tell from my reading to date.
Soenke is correct: what kind of statistical errors do you mean? In any case, I'd suggest that editors of scientific journals require compulsory submission of the code and data used for the experiment(s) performed. Referees would then be able to replicate them. Guido.
Yes, that is true. Because the statistical errors are not estimated correctly, the conclusions may be wrong - but not in all situations.
Just to complement Matthew's post: significance testing is important for scientific discussion, but it is only a tiny (though not insignificant) part of it. Replication, retesting, and looking at the same problem from different angles with different methodologies should all be used before an alternative hypothesis is accepted. Comparing estimated values and standard errors is only a first step in a process that is several marathons long.
Statistics should be used by people who know the theory, not by people who are interested only in a click-and-use technique. It is a subject that must be handled very carefully; otherwise it will lead to wrong results. Nothing is wrong with the subject - the mistake lies with the user.
I agree with Robert Samohyl. The alternative hypothesis is the complement of the null hypothesis, so your test strictly says either that the null is correct or that the alternative is significant, and the null hypothesis restricts your testing.
I disagree... you do not establish a true null hypothesis, only a high probability in favor of the null H0 or the alternative H1. There are also several different statistical methods that may be used for the same experiment or observation. I know very few researchers who really understand and can apply causal inference analysis, or who truly know the limitations of ANOVA.
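A small sketch of the first point (my own hypothetical numbers): a non-significant test does not establish the null, because with small samples the confidence interval remains compatible with both no effect and a large effect.

```python
# Two small hypothetical samples: a real effect of 0.5 that a t-test will
# usually fail to detect, illustrated with an approximate 95% CI.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(0.5, 1.0, 8)     # hypothetical group with a true effect of 0.5
b = rng.normal(0.0, 1.0, 8)     # hypothetical control group

t, p = stats.ttest_ind(a, b)
diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
lo, hi = diff - 1.96 * se, diff + 1.96 * se
print(f"p = {p:.2f}, difference = {diff:.2f}, approx 95% CI = ({lo:.2f}, {hi:.2f})")
# With samples this small, p will usually be above 0.05 even though the true
# effect is 0.5, and the CI spans everything from "no effect" to a large one:
# failing to reject H0 is not the same as showing H0 is true.
```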
Jayalakshmy Parameswaran has it right, and Matthew Marler makes an important point. For example, in the world of epidemiology, unique populations are selected, often with large non-response fractions, to make generalizations about the universe. Even if the researchers do not do that, the media do.

Back to Parameswaran's argument: too many studies are conducted by would-be statisticians with no idea of what they are doing. Statistical analysis software packages are used, and small P-values or confidence intervals that exclude the mystic 1.0 are taken as proof of whatever. Greatly improved analysis procedures that have been available for two to five or six years are not used, simply because they are not available in user-friendly "packages".

Soenke Bartling cites several good reasons for the bad conclusions. I would add the pressure to ensure grant renewal, the publish-or-perish nonsense, the public clamor for the latest "cure", and sheer ego. I have seen studies, essentially identical in conduct, analysis, and results, conducted by different experimenters: one claims to have proven "everything", the other states modestly "these are the results" and "here are the uncertainties". More realism and a degree of moral responsibility, especially in public health matters, would appear desirable on the part of many "researchers". The peer review process is sorely lacking in this regard.
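To illustrate the "mystic 1.0" point with made-up counts (a hedged example, not data from any study mentioned here): a huge sample can push the confidence interval of a trivial relative risk just past 1.0.

```python
# Hypothetical 2x2 table: exposed vs. unexposed, cases vs. non-cases.
# A tiny relative risk becomes "significant" purely because n is enormous.
import numpy as np

a, b = 10_300, 989_700    # exposed:   cases, non-cases
c, d = 10_000, 990_000    # unexposed: cases, non-cases

rr = (a / (a + b)) / (c / (c + d))
se_log_rr = np.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
lo, hi = np.exp(np.log(rr) + np.array([-1.96, 1.96]) * se_log_rr)
print(f"RR = {rr:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# Roughly RR = 1.030 with CI ~ (1.002, 1.059): the interval excludes 1.0, so it
# is "significant", but whether a 3% relative increase matters is a scientific
# question, not a statistical one.
```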
I very much agree with John Fennick's complaint. Recently, in some paper, I saw a negative R2 - how can that be? I have even seen data plotted with the independent variable along the Y axis and the dependent variable along the X axis, and, for the F statistic, implausibly high significance values being reported. All of this happens because the basic definitions of these statistics are not understood. So what I feel is that, in the end, the results should be reviewed by an expert in statistics. I feel that non-statisticians are reluctant to consult an expert for fear that there will be one more addition to the author list; what the statisticians do should also be scientifically acknowledged.
On the Oklahoma State University website, or just through Google, you can find Grice's website, his selected papers on the subject, and his blog posts about the errors. Many statisticians also agree with Grice's conclusions, as highlighted both in the popular press and in scientific journals.
Frederic, his name is Dr. James Grice and here is his website with commentary:
http://psychology.okstate.edu/faculty/jgrice/
Offhand I am not sure which paper actually states the 70%, but I worked at OSU and spoke to his grad and undergrad students about that figure. When I find time I will get the publication for you, but other researchers have made the same statements in print; it is commonplace now to call correlation studies in scientific research into question. Several professors and grad students at OSU agree with Grice's figure of roughly 70%. My apologies for not having exact publication quotes, as I have been extremely busy of late.
Jan Rasmus Boehnke, I am glad that you provided a link to Ioannidis' work. He has also published in the New England Journal of Medicine and has been widely cited. I don't know why I didn't provide a link myself.
It may also be interesting to look at some work by Wicherts:
Bakker, M., & Wicherts, J. M. (2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43, 666-678.
Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS ONE.
Wicherts, J. M. (2011). Psychology must learn a lesson from fraud case. Nature, 480, 7.
The publications can be downloaded from his website at the University of Amsterdam.
Having said that, I would also like to bring in a counterpoint (if only for the sake of argument).
Jayalakshmy Parameswaran states that statistics should only be applied by people who know the theory - so, by statisticians. My personal experience with statisticians, however, is that they often make implicit assumptions because they do not really understand the data. As an applied researcher, I have run into this problem quite often: I accidentally find out something about the assumptions (e.g. that the data must be balanced), I ask about it, and the statistician says, "Oh well, no, if that is the case technique X cannot be used. No, sorry." Communication between statisticians and applied researchers is not always easy, which I think may be part of the reason why people just try to do it themselves (I don't think people really care about the extra authorship).
Restricting the field of statistics to statisticians alone is not a good option. Much research is attempted on a trial-and-error basis, that is, learning by doing. Indeed, good peer review is a necessary ingredient. The other way, 'statistics only for statisticians', would do more harm than good. If one gets the concepts clear enough, one is free to apply the tools that reveal hidden correlations, hypothesis testing, and OLS regression analysis, which is often an indispensable tool for analysing patterns and for interpolation as well as extrapolation. The problem is too much research adopting these tools as a shortcut, without going deep enough into the theory and the counterarguments. That can be avoided if peer review is good. So I don't agree with Jayalakshmy in this regard, but about the conceptual paradigms she is right.
S. Chaterjee is correct, IMHO. However, the crux of the problem lies in peer review, which all too often really stinks. Journals should be more careful when selecting reviewers and judging methods!
One cannot judge conclusions just by statistical errors. Those errors may have multiple causes: wrong sampling techniques or methods, irregularities in how the independent variables are selected, or ambiguity in the predictor variables that creates confusion about what should be correlated with what, and how strong that correlation really is. Misunderstanding the origin of such errors, and how the strength of relationships between variables and the errors themselves are reported, can also contribute to wrong conclusions. The method of the experiment should be clearly defined and its steps followed; the data samples used, why those samples were chosen, the identification of the population group, the definition of the variables - all of these count. As Kim de Jong said above, one should re-examine whether the data are balanced and question selection methods based on premises, since implicit premises can run into trouble. Making sense of the statistics applied is more important than vaguely throwing stones in the dark.
Without philosophizing, I can say I have been personally involved in such a disturbing episode. When measuring the electron contamination in a high-energy photon beam, using a strong magnetic field to decouple the two, I ended up with a conclusion that turned out not to be the case, as was pointed out by a kind colleague who happened to be working on the same thing at another well-known institution. My data and my numbers were correct, but my conclusion was faulty! Suffice it to say, my youthful vigor suffered a huge setback and I ended up eating a lot of humble pie, to the point of throwing up! But I still see science as the only meaningful thing standing for humanity at the end of the day.