Most PhD work is littered with statistical markers (SD, ANOVA, p=, N=, two-tailed lesser-spotted thingamajig), but these often refer to simple relationships that might be better understood (& evaluated) if they were written in plain language. Anything above basic statistics has a specific use (a bit like the need to speak Latin, program in Python, perform extraordinary feats of mental maths etc.). So why all this emphasis on being a statistical genius, & how many of us (beyond those with the job title 'Statistician') are genuinely conversant with the field?
Is statistics used like a badge of 'cleverness'? I only use Latin for established phrases, I use a calculator for clever maths, I prefer percentages and plain words to explain how variables relate to each other, & I use SPSS to do the clever statistics thing (if I don't do percentages on a calculator). Despite what is claimed, not all fields are actually 'scientific', and I think that the output of our research should be designed for clarity and usability. In my experience, the overuse of statistics does not promote this outside of genuine hard science.
Am I 'worthy' of being in the Ivory Tower?
Can anyone share articles which explore how the average person understands (or wants) heavy statistics, and whether plain language could demonstrate a point more clearly?
This is possibly also interesting for you:
Gerd Gigerenzer: Mindless statistics. The Journal of Socio-Economics 33 (2004) 587–606
Look at section 4 (p. 594-) and digest Fig.1 (p. 596).
In my opinion, a researcher should only apply tools (besides all measurement instruments and experimental details, including mathematical and statistical tools) he/she really understands well enough to correctly interpret results, identify possible problems and limitations, and judge reliability, validity and generalizability. In many fields this can't be achieved by a single person, so we have to work in teams with experts in different subjects, where each expert needs enough knowledge outside their main expertise to communicate effectively and successfully with all other team members. I see a major problem in that science has become much too "economized", being degraded to a "career option", focussing on "publication output" and "funding". This seems to lead to an increase in data generation where most people no longer *really* understand what is going on. There is no time to learn a method thoroughly. Methods should just be applied, quickly, to generate data and publish highly significant results. I know PhD students who perform (apply) real-time PCR but have no idea (or only a very, very vague idea) about the process, the underlying biochemistry, the principle of signal generation, the principle of the measurement and quantification, and the meaning of the read-out. So they know almost nothing about the method, and they don't know what the read-outs actually mean. How can one expect them to understand how the generated data should reasonably be analyzed? And they don't need to know (practically), since they will only re-do what they read in other papers, following "cookbook recipes" without understanding. I have found the same with postdocs. (I am not generalizing! I just say that such cases exist, although I have the feeling that such cases are not quite rare.)
In statistics we have an even more severe problem. Statistics is (typically) taught in a nonsensical way, focussing far too much on hypothesis testing (as if this were the showcase field of statistics). And it is woefully neglected to teach the philosophy of (empirical) science: what knowledge and information mean, how data is related to information, and how this changes knowledge. A part of statistics is then a mathematical/quantitative treatment of the "representation of knowledge" and "information" and their interdependence. Understanding this would shift the focus of research from the utterly nonsensical question "is it significant?" towards the more sensible question "what can we learn from the available data?".
Generally, I find it embarrassing when people publish their t-tests and ANOVAs and whatnot and can't reasonably answer why they decided to do a test at all, or why they wanted to control which error-rate at what level (why the TWER or why the FWER? why alpha=0.05? what actually is beta?), and the only answers you get are
- "because others also did it",
- "it is convention",
- "the stats software does it for such data",
- "my boss/reviewer asked me to do this", and, eventually
- "I actually have no idea"
Sometimes you will hear all the misconceptions about tests and decisions, like
- "to prove that the null hypothesis is false / the alternative is true",
- "to demonstrate the reliability/validity of my findings",
- "ton show the relevance of my results" and so on.
No, it is *not* scientific to do tests and present p-values and other statistics. It is especially non-scientific to rely on automated decision-rules and focus on long-run error-rates instead of considering the individual case in its scientific context. It is scientific to understand what the data can tell us and what we can learn from the data. This is 95% expertise and common sense and only 5% statistics, where the most important part of "statistics" is not doing some tests but thinking about good summaries of the relevant information in the data, its visualization and smart exploration.
If you use a statistical test, then you need to understand its underlying logic (which, for multivariate methods, usually entails understanding linear/matrix algebra, even more than calculus). SPSS, SAS, MATLAB, R, Sage, Statistica, etc., have provided ways in which primary school children can carry out advanced statistical analyses with no training in statistics but some instruction in programming, or even just a set of instructions (button-pushing statistics; common enough in research already). However, a great many statistical methods do not require an advanced background in mathematics. The problem is that, because there are so many advanced methods out there which are both useful and easily used, statistics courses have increasingly become classes which teach one how to use software, not statistics. It is better to have a firm understanding of the algebra of simple statistics and modeling, combinatorics, and elementary probability than it is to use SPSS to run analyses one doesn't understand: "garbage in, garbage out".
There is a fantastic intro stats book that requires almost no mathematical background that I've recommended to grad students who have taken both intro stats and a multivariate stats grad course: Rand R. Wilcox's Basic Statistics: Understanding Conventional Methods and Modern Insights (also good, and requiring no more mathematical foundations, is his Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy).
Hi Andrew - thanks for getting the ball rolling.
I agree with your concept of knowing the underlying principles of statistics usage. You mention classes '...which teach one how to use software, not statistics.' I personally do not see this as a bad thing, as long as the person is clear what the statistical test means and how to apply it. As you say: "garbage in, garbage out", so teaching meaning is a good starting point.
I would argue, though, that for many people it is not necessary to know the logic or workings behind the button, the principles of modelling, combinatorics, and elementary probability. Especially in 'the professions': how many otherwise expert doctors, teachers, nurses etc. know (or need to know) this information?
You need to know if your tool is trustworthy and you are using it appropriately: you need to know when different basic functions are needed, how to use the tool which gives you the numbers & how to apply them. Importantly, you need to know how to translate the output into a meaningful conclusion / context that can be understood by an otherwise educated person who would benefit from the information. My point is that many articles prompted by PhD studies are littered with values, but are seldom supported by any text that explains what they mean in real terms. It is all very well claiming that something is 'statistically significant', but it does not always mean anything ... significant unless it is properly explained. For most people, the written word takes precedence over any specialist statistical language, so we need to 'translate' our findings into language that is *commonly* understood & in my field (education/healthcare) this is not always the case.
You can either say that the reader needs to become an expert to understand what is being said, or that the author needs to present their point more clearly so that it is more easily understood - either way, statisticians or statistical tools produce data, and it is our responsibility to interpret and apply the findings. My point is that lots of the investigations we carry out do not initially require an advanced amount of statistical analysis to make a valid or useful contribution. So why put so much emphasis on it?
http://annals.org/article.aspx?articleid=1090696
http://www.ncbi.nlm.nih.gov/pubmed/9538712
http://www.nurseeducationtoday.com/article/S0260-6917(10)00243-1/abstract
http://www.sciencedirect.com/science/article/pii/S1053535704000927
http://www.sciencedirect.com/science/article/pii/S175115771200065X
Rather than offer some esoteric, erudite, pompous, and/or elitist response, let me share a bit of personal experience. I started graduate research at what I believe to be a fairly respected institute. In fact, there are few universities in the entire world more associated (rightly or wrongly) with erudition, academic integrity, and general quality scholarship than the institute I began my research and graduate studies in. Yet I saw the improper use of fairly basic statistics. I saw conclusions from studies which had no empirical defense because they had no quantitative basis. I tried to understand what was behind this clear violation of sound logic, mathematics, etc. I soon learned that nothing was behind it other than a lack of familiarity with the necessary quantitative procedures. It was very bright individuals violating mathematical/statistical logic that was basic to me (and far more basic to those smarter than I am and more familiar with mathematics). And by doing so their results were utterly meaningless, and could be shown to be so. I tried to show this, but the inability of some colleagues to understand such explanations prevented me from being able to illustrate what I intended.
A simple question then: if you performed some statistical analysis with SPSS based upon your knowledge that the research question you had related in some sense with some statistical method, used it, and I objected and gave you reasons based upon linear algebra, multivariate mathematics, and/or advanced statistical theory, could you understand my objections enough to show me where I erred, where I might be right, or what the disconnect might be?
I need to emphasize that I do not mean to belittle or in any other sense/way insult the assertions you've made. I simply feel that the increasing use of advanced statistical software has been accompanied by an inability to use it properly, thanks to current pedagogy.
Nicholas,
I fully agree with Andrew. The world is not a simple structure. Mathematical models of reality rely on simplification and reduction: they neatly model a structure of reality that is not reality itself, even in a mathematical sense; it is simply a structuring of reality, and under-determination and uncertainty cannot be ruled out. Because these models work via technology, they are taken as real. However, most events/phenomena in the world do not occur with the regularity that would allow them to be captured by deterministic mathematical equations; mostly they are stochastic, and here lies the importance of probability theory and statistics. If one understands one's problem well and uses statistics with UNDERSTANDING, it is not a clever trick; the results will be as reliable as those from analytical mathematical solutions, approximation of functions or numerical analysis. After all, statistics is based on the mathematical theory of probability. It is not that social scientists alone use statistics; it is used extensively in the hard natural sciences. I have posted a similar question on RG; of course, certain scholars voiced doubts like yours, but most agreed on its utility and significance:
https://www.researchgate.net/post/Why_Statistics_Matter_OR_Why_does_a_researcher_need_to_learn_statistics
Looking at the recent replies - I still think that we should use basic statistics to support most of our research, but have a fair grounding as to why we are doing so, how they are applied and importantly, what the results mean in terms of real-world relationships. I appreciate the need for expert statisticians who can support those of us who are not so gifted with statistical knowledge, but as I often say to authors, it is the argument and logic behind a piece of work that gets the message across, and not the (pseudo)complexity of the analysis. I admit that there are many who could brow-beat me with linear algebra, multivariate mathematics, and/or advanced statistical theory, but only if I choose to get into such a fight ... which I do not.
I think an important starting point is our audience - what do they need to see that will enable them to confirm, replicate or build on our ideas? I think that if I meet these needs, then I am using my tools (words, analysis etc.) appropriately. If I produce a raft of 'stats' that nobody needs or understands, then who am I writing for - them or me?
Andrew/Jochen - I agree that if our field demands advanced approaches, then our knowledge has to be commensurate, but just because someone is dealing with a topic that can often be addressed by simpler means, it does not make them 'lesser' in any way. If they need more complex approaches, then there are always supportive and knowledgeable colleagues who can help them out .... teamwork is not a bad thing :-)
"Everything should be made as simple as possible, but not simpler." Albert Einstein
Nicholas: "but as I often say to authors, it is the argument and logic behind a piece of work that gets the message across, and not the (pseudo)complexity of the analysis." - well said! This hits the mark.
Unfortunately, here is a problem: "I think an important starting point is our audience" - the audience, regrettably, often expects such things. They are "trained" to watch out for "significant results" and to uncritically accept conclusions based on inscrutable but undoubtedly highly scientific and objective mathematics (spot the sarcasm ;)), whereas well-discussed findings based on effect estimates are not taken seriously, since they seem to lack some objective proof ... somehow ... something mathematical... At least in my field of research.
"If I produce a raft of 'stats' that nobody needs or understands, then who am I writing for - them or me ?" - Hmm.... for both, always. For others to communicate your findings, and for you to get funding or a job (eventually and hopefully). The second part of motivation is usually considered more critical (how bad it is for science, though!), and in an environment where your papers are seen to have a higher "scientific quality" when it is full of things nobody understands, and full of other things the readers (reviewers/editors) expect from good papers, it seems to be the more promising way to simply put all these non-sensical analyses and statistics and sophisticated stuff in your paper.
Frequently, students I have supervised seem to think that statistical tests come first, rather than being a source of guidance on how far we can stretch the inferences we can make by "looking at the data" and derived summaries. They just describe effects as statistically significant or not. This results in very boring "results" sections lacking the information that the reader wants to know. When I read a paper I want to know the direction and size of an effect and what patterns are present in the data; if there is a test, it should help us decide how much precaution we need to use until additional evidence becomes available. Many students and experienced researchers who "worship" p-values and the use of strict risk levels ignore how powerful and important the careful design of experiments is, and how the frequently seen use of "approximate" randomization procedures, or the approach of repeating an experiment until the results become significant, invalidates the p-values they report.
[edited 5 min later] As I read again what I wrote it feels off-topic, but what I am trying to say is that not only the proliferation of p-values, and especially the use of fixed risk levels, but also, many times, how results are presented, reflects a much bigger problem: statistics being taught as a mechanical and exact science based on clear and fixed rules. Oversimplifying the subtleties and the degree of subjectivity involved in any data analysis, especially in relation to which assumptions are reasonable or not, and how any experimental protocol relates to which assumptions are tenable or not, is simply not teaching what would be the most useful training for anybody doing experimental research. So, in my opinion, yes, we need to understand much more than basic statistics in terms of principles, but this does not mean that we need to know advanced statistical procedures unless we use them or assess work that uses them.
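To make the contrast concrete, here is a minimal sketch in Python (with simulated, purely hypothetical numbers) of reporting the direction and size of an effect with a confidence interval, rather than only a significance verdict:

```python
# Hypothetical illustration: report the effect itself, not just "p < 0.05".
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
control = rng.normal(50.0, 10.0, 40)   # simulated outcome scores
treated = rng.normal(56.0, 10.0, 40)

diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / 40 + control.var(ddof=1) / 40)
lo, hi = diff - 1.96 * se, diff + 1.96 * se
p = stats.ttest_ind(treated, control).pvalue

# A sentence like "treated scores were about 6 units higher (95% CI lo to hi)"
# tells the reader direction and size; the p-value alone does not.
print(f"mean difference = {diff:.1f} (95% CI {lo:.1f} to {hi:.1f}), p = {p:.3f}")
```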
“You can please some of the people all of the time, you can please all of the people some of the time, but you can’t please all of the people all of the time”. Abe Lincoln (Ironically: Honest Abe)
We have to live with ourselves though, and I have never been one for going to bed with a bad conscience - I guess we have to meet the audience's expectations after all.
“Don't try to make yourself marketable, you'll be surprised to see yourself at the bottom. Stay incognito, and people will peruse the whole world looking for you, by that time, you'll be at the top.” ― Michael Bassey Johnson
Kevin - as for Albert E ... you can't argue with that ;-)
Just arrived home. Found a copy of a book that I had ordered some days ago. From just reading the preface it looks very very promising as a possible basis for a course that would address the concerns some of us have voiced above. The book is Diggle, Peter J., Chetwynd, Amanda G. (2011) Statistics and Scientific Method. Oxford University Press, Oxford.
Pedro: I think your comments are very pertinent. Of course we need the whole range of statistical tools in different settings, but you talk about these boring student summaries of (basic) statistics - perhaps if students related their results better to real-world meaning (in words, not just an indicator of statistical significance), then we might grasp the importance of what they are trying to say. I think many new students approach statistics from a very isolated angle - yes, there is a marked correlation between variables X & Y which is statistically significant in terms of it meeting the detection parameters of the test (so it gets flagged up & brought to our attention). The interpretation of this, however, is not a function of the statistical package, & this process often shows the student's grasp of their subject. If basic statistical tests & results were more clearly reasoned and interpreted by early-stage researchers, then I think these works would possibly be more meaningful & easier to assess. That is why I argue that we should initially concentrate on the how & why of basic statistics & not so much on how they work. My computer (by all accounts) produces some quite good stuff, but I have limited knowledge of (& less interest in) how it works. As with most tools, it is how it is used that gets results. Most statistics books contain the caveat that the interpretation of values is the sole responsibility of the user, so I see things like SPSS etc. as tools - although the need for background knowledge grows with complexity, the creators wouldn't have created a push-button system unless they wanted it to be as easy as possible to use (not understand). That takes us back to teaching the 'how' & 'why' & the overriding importance of translating how something marked as 'statistically significant' relates to the real world ;-)
Bjoern - well said. With regards to (2) communicating statistical analysis, check out the books/work of Edward Tufte.
To echo some prior comments...
Yes, statistics is good as a tool so long as you know it is appropriate to the task and what the results mean (as well as what they "don't mean"). I would simply add that many times we use basic and advanced statistical tests and distributions to better understand physical systems (e.g., atmospheric or environmental behaviors) in terms of their attributes and relationships.
The tools help bridge the empirical data (only when used properly) in order to form or advance a conceptual model which can be expressed mathematically for the purpose of running a numerical model. The caveat is that you would not go through that entire sequence of study unless you had reliable statistics that were physically meaningful.
Getting students to understand this approach takes quite a bit of time as they view the statistical aspects as 'steps' that provide 'answers' rather than tools that help us to discern properties/behaviors that can be used to construct the conceptual model.
The first and most important thing to learn with statistics is how the whole pattern of data can be described--and this BEFORE (adequate!) inferential statistical models are applied to the dataset. First "make friends with your data": understand the whole configuration, what the relations among the variables look like, whether the obtained pattern generally points in the expected direction, and so on. This means: first the hard work of understanding the whole, to be able to develop an adequate strategy for further and deeper data analysis.
An excellent article on this idea can be found elsewhere by Dan Wright:
Wright, D. B. (2003). Making friends with your data: Improving how statistics are conducted and reported. British Journal of Educational Psychology, 73(1), 123-136.
CCC
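As a small aside, a minimal sketch (in Python, on an entirely made-up dataset) of what "making friends with your data" before any inferential model might look like:

```python
# Hypothetical data: summarise and plot the whole pattern before any testing.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
dose = np.tile(np.linspace(1, 5, 50), 2)
group = np.repeat(["A", "B"], 50)
slope = np.where(group == "A", 0.8, 1.5)          # assumed group trends
response = 2 + slope * dose + rng.normal(0, 1, 100)

df = pd.DataFrame({"group": group, "dose": dose, "response": response})

# Location, spread and extremes per group -- the "whole configuration".
print(df.groupby("group")["response"].describe())

# Raw pattern: direction, shape and spread, seen before any model is fitted.
for name, sub in df.groupby("group"):
    plt.scatter(sub["dose"], sub["response"], label=name, alpha=0.6)
plt.xlabel("dose"); plt.ylabel("response"); plt.legend(); plt.show()
```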
Good article CC - I realised that my initial question was perhaps phrased a bit incorrectly, so I have modified it to specify the context of 'routine investigation'. I realise that we of course need advanced statistical methods (& knowledge) where we have a complex issue to explore. My gist, though, was that we tend to over-complicate matters that can be well explained by basic statistical enquiry & a clear use of language. In my opinion, this would make work more beneficial (& understandable) to a far wider audience, which can only be of benefit.
Interesting last sentence of the article attached by CCC!
"Conducting data analysis is like drinking a fine wine. It is important to swirl and sniff the wine, to unpack the complex bouquet and to appreciate the experience. Gulping the wine doesn't work."
Try to explain that to "busy scientists" ... who are using kits, pre-made methods and large screens without really having a feel for the fields they are studying!
Although I also find this sentence nice and noteworthy, I want to point out that we must take care not to mix two quite different approaches to "science": the exploratory and the confirmatory approach. The vast majority of scientists in the biomedical field seem to follow the confirmatory approach: well-defined hypotheses are set up, all relevant properties of the data and the effects are assumed to be known, and a simple yes/no decision about the hypothesis is sought. This is something different from the exploration of data, even if the data is generated in well-designed experiments based on some hypothesis. The exploratory analysis is more open-minded but also more difficult and complex, and, in case of doubt, needs a lot of expertise not only in setting up a sensible experiment but also in learning from the obtained data.
A good portion of statistics is about whether we have a "good enough" yardstick to measure something in the face of variability. So "significance" only says that "we can measure it." Big things can be measured with rough yardsticks and little things may need more precise yardsticks. Whether something that we can measure is important or of any consequence is entirely outside of statistics, must make sense in the application, and usually has a simple explanation. Coming up with the simple explanation involves understanding the measurement technique. So it pays dividends to use simple techniques in routine investigation of big things!
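As a quick illustration of the "yardstick" point above, here is a minimal sketch (simulated data, all numbers hypothetical) showing that with a large enough sample even a practically negligible difference gets flagged as "significant":

```python
# Hypothetical simulation: a tiny effect, a huge sample, a "significant" p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000
a = rng.normal(0.00, 1.0, n)   # control group
b = rng.normal(0.02, 1.0, n)   # "treated" group: true effect of 0.02 SD

t, p = stats.ttest_ind(a, b)
d = (b.mean() - a.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)

print(f"p = {p:.2e}")   # typically far below 0.05: the yardstick can "measure it"
print(f"d = {d:.3f}")   # ~0.02 of a standard deviation: practically negligible
```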
Agreed. It also raises once again the concern whether the data that are available are representative in time and space of the phenomenon or system of interest. If the initial data set sampling is inadequate there is not much value to the analysis...other than for that specific set of data, circumstance, and metadata.
To the point by Nicholas (and others): if a t-test is appropriate and you can do a t-test, OK, but only a statistician is qualified to make that decision. After more than 30 years working as a statistician in drug development, I don't practice medicine, but I know enough to follow most basic medical discussions. Those with a limited knowledge of statistics should follow a similar philosophy.
David - good point. I think knowledge should be commensurate with the needs of your situation. As JW differentiates, my original question was referring to basic exploratory statistics. Of course, you should know the rationale and ramifications of any test you undertake, but whilst a statistician is qualified to select appropriate methods (and IMO is a much-needed source of professional support), I often feel that: a) people think that unless you know the 'workings' behind a test, you cannot be seen to be using it appropriately (see my earlier comments about my calculator, computer & SPSS); b) many simple exploratory / descriptive statistics are easy enough for non-statisticians to employ, and as long as the user understands what the test does and means, then the 'push the button' approach of programmes like SPSS provides a practical & utilitarian way of obtaining an analysis measure. How this measure is interpreted and explained, however (beyond a raft of SD, p, n, etc.), is very often missing or inadequate. Therefore the presence of such statistics is often (as described by others and in the literature) ... meaningless. IMO, most research articles are about the topic, & statistical analysis only goes to support an argument and provide bases from which further enquiry can be launched - it is not the focus. Therefore, whilst the 'conceptual model', 'predictive capability' etc. are all very important elements of the tools themselves (& often require expert guidance from statisticians), much of our basic exploratory, descriptive work does not require these tools/approaches. So, to include them (and the complicated means by which they work) does not serve a productive purpose.
Nicholas, I will give you an example (one I recently experienced) that may apply to both descriptive/explorative and inferential statistics; it shows the key problem (a lack of statistical thinking) and how this hinders good research:
A researcher was interested in the growth rates of two virus strains in cell culture experiments. Data (the virus RNA copy numbers, determined by some molecular biology method) were measured at several time points after inoculation. To get some idea of the variability, 3 experiments were done in parallel and measured at the same time points. I think 7 or so time points were considered, 3 of them so late that the growth was already in plateau.
Now the analysis of this data was a series of t-tests (one per time point) comparing the values of the two groups.
The question about the growth rates was not addressed at all by this analysis; the result is almost meaningless. The researcher has learned nothing about the growth rates of these viruses.
With some ability in statistical thinking, one would recognise that the growth might be modelled as a logistic function (or, more simply, as an exponential function at times when growth is far from the capacity). [One might also consider whether the question of different capacities is biologically relevant, but that's a different topic.] One would then see that it is best to concentrate all the measurements at early time points, before the plateau is reached. A preliminary experiment could have indicated a suitable range. Then an exponential model could be fitted and the growth rate estimated directly, using all the data (not only the data from one time point). Further, it would be clear that replications at a particular time point do not improve the result. If 7x3 = 21 measurements (as done) could feasibly be taken, it might have been advantageous to take single measurements at 21 different time points, to get a more precise idea of the range (in time) of exponential growth.
When I saw the analysis that was done by the researcher I had to think about this sentence:
When a hammer is all you know every problem looks like a nail.
Basic statistical thinking is very important for researchers. A lack of it frequently leads to a failure to even answer the interesting questions at all, irrespective of the kind of analysis (explorative or inferential).
I am sure that SPSS would even have a button for this problem, but the key is that the researcher is not able to mentally formulate the problem and thus seek an appropriate solution (which might be obvious, clearly described, and simple to apply). In this case a simple linear regression on the log values might have been good enough, and the correctly specified model would directly give the difference of the growth rates as the interaction of "time" and "virus type", straight away with its standard error - or better - confidence interval. I do not need to stress that the researcher is not aware of "linear models", "interaction" or even "confidence intervals".
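For readers who want to see what such an analysis could look like, here is a minimal sketch in Python with simulated, purely illustrative numbers (the same model could of course be fitted in R, SAS or SPSS); the time-by-strain interaction term directly estimates the difference in growth rates:

```python
# Hypothetical data: RNA copy numbers for two strains during exponential growth.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
time = np.tile(np.arange(1, 8), 6)            # 7 time points x 3 replicates x 2 strains
strain = np.repeat(["A", "B"], 21)
rate = np.where(strain == "A", 0.9, 1.2)      # assumed true growth rates
copies = np.exp(2.0 + rate * time + rng.normal(0, 0.3, time.size))

df = pd.DataFrame({"time": time, "strain": strain, "log_copies": np.log(copies)})

# log(copies) ~ time * strain: the interaction coefficient is the estimated
# difference in growth rates, reported with its standard error and CI.
fit = smf.ols("log_copies ~ time * strain", data=df).fit()
print(fit.params["time:strain[T.B]"])          # rate difference (true value 0.3 here)
print(fit.conf_int().loc["time:strain[T.B]"])  # 95% confidence interval
```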
Don't get me wrong! The researcher is not stupid. He is a very smart virologist (surely much smarter than I am), and he does perform great experiments to increase our knowledge about virus metabolism and host reactions. It's only that such people could be even more productive and get clearer (easier to interpret) results, possibly with less effort, if they had a little more "statistical thinking".
PS: I discussed this with him and offered the alternative. He refused with the argument that the reviewers are used to this kind of (wrong) analysis and that he would have unnecessary trouble publishing the correct analysis.
Thank you Jochen again for this nice example!
I think that this example illustrates how some researchers try to reinvent the wheel.
That also illustrates that good lab protocols should not only state how much water to add to the plastic dish, but also explain the expected read-out, how to report the data and how to analyze them.
"When a hammer is all you know every problem looks like a nail."
That made my day :D
Great questions and also great answers! But it does not change the current situation: statistical analysis itself is an obvious part of experimental studies. In my opinion ANOVA, ANCOVA, MANOVA, etc. are simpler to use than previous "traditional" tools, but a researcher should know, be able to choose, and apply all of them. And the number of tools increases with each day :)
Dear Dr. Mikołajewska:
I am not certain what you mean by "previous 'traditional' tools" compared to "ANOVA, ANCOVA, MANOVA, etc." Analysis of variance has a very long history, but its modern formulation owes itself especially to work by Laplace, Gauss, and Quetelet in the 19th century. When Fisher's (1925) Statistical Methods for Research Workers put ANOVA & ANCOVA permanently on the map as a central research tool, he and "Student" weren't really saying anything new (rather, his work on the analysis of variance was, knowingly or not, almost completely a rediscovery of mathematical developments of the late 19th century, especially by T. N. Thiele). In 1931, several relevant papers were published.
1) A. L. Bailey's "The Analysis of Variance" was published in Journal of the American Statistical Association, in which he states: "The analysis of covariance is the extension into the field of two variables of the methods for the analysis of variance of one variable discussed by R. A. Fisher in his Statistical Methods for Research Workers. It is not primarily a new method but rather a simple and uniform way of computing, presenting, and interpreting the statistics for a wide group of problem." (emphasis added).
2) The "Analysis of Variance" section of Fisher's book was included in a review of recent developments in mathematical statistics written by the notable statistician Hotelling in a paper "Recent Improvements in Statistical Inference" (Journal of the American Statistical Association)
3) The lack of sufficiently rigorous treatment in Fisher's book was more than made up for, especially by J. O. Irwin's "Mathematical Theorems Involved in the Analysis of Variance" (Journal of the Royal Statistical Society).
The first reference I know of to ANOVA and ANCOVA as standard tools comes only three years later in the 1934 paper "The Distribution of Quadratic Form" (Proceedings of the Cambridge Philosophical Society) by Cochran: "Many of the most frequently used applications of the theory of statistics, such for example as the methods of analysis of variance and covariance, the general test of multiple regression and the test of a regression coefficient..."
In other words, I'm not sure how much more "traditional" you can get than ANOVA, ANCOVA, etc. In fact, far superior methods were developed in the first half of the 20th century but couldn't be used because they required computers: "Many of the statistical methods routinely used in contemporary research are based on a compromise with the ideal. The ideal is represented by permutation tests, such as Fisher's exact test or the binomial test, which yield exact, as opposed to approximate, probability values (P-values). The compromise is represented by most statistical tests in common use, such as the t and F tests, where P-values depend on unsatisfied assumptions." (from the first page of Mielke, P. W., & Berry, K. J. (2007). Permutation methods: a distance function approach (2nd Ed.). Springer.).
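To illustrate the permutation idea in that quotation, here is a minimal sketch (in Python, with simulated skewed data; everything here is hypothetical) of a two-sample permutation test built by reshuffling group labels, shown alongside the usual t-test approximation:

```python
# Hypothetical skewed samples where the t-test's normality assumption is shaky.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a = rng.exponential(1.0, 15)
b = rng.exponential(1.5, 15)

observed = b.mean() - a.mean()
pooled = np.concatenate([a, b])

# Reference distribution: the same statistic under random relabelling of groups.
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[15:].mean() - pooled[:15].mean()
    if abs(diff) >= abs(observed):
        count += 1

print("permutation p:", (count + 1) / (n_perm + 1))
print("t-test p:     ", stats.ttest_ind(a, b).pvalue)
```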
Personally, to me the "non-traditional" tools are things like evolutionary algorithms, multidimensional scaling, classification and clustering algorithms and techniques, fuzzy set theory, ANNs, SVMs, etc. Not tests that, while they were certainly not dominating research prior to the 20th century, have since become so ubiquitous that over 50 years of increasingly critical research hasn't much dented their mindless, routine application. And while ANOVA or MANOVA and other standard methods are simple, many of these are still regularly misused.
You are absolutely correct that statistical analysis is a fundamental tool in scientific research. However, the statistical methods and tools available are quite vast, while those generally employed are not only limited but are mostly a selection from statistical developments of the early 20th century. Much research both fails to use the superior tools/methods available and misuses standard statistical tests/methods.
Dear Dr. Salustri:
It's certainly true that news reports, debates/arguments, and even advertisements relying upon or involving statistics are as frequent as they are misunderstood. Even people who had some exposure to statistics and/or probability when in school do not usually retain what they learned for any extended period of time. More importantly, probability, statistics, and even logic are about as far from "common sense" and intuition as is possible. Humans are not particularly inclined towards formal reasoning. There have been decades of research and countless studies concerning how, when, and why humans fail to comprehend rather basic elements of probability, consistently draw false inferences, and do not reason logically. While a lot is unknown, we do know that very, very few people can understand the kind of formal reasoning necessary to "get" statistics simply through exposure, and most need more than pre-college education and some intro stats course just to get the basics. It would be great if we could restructure primary and secondary education such that we stopped trying (and failing) to teach mostly useless mathematics and instead taught CONCEPTS from probability, logic, statistics, and the foundations of mathematics (set theory, algebras, proofs, etc.). Personally, I think this is overly optimistic. I think we will continue to teach topics like matrix algebra to 15-year-olds and require science majors to take 3 semesters of calculus without actually teaching it.
That said, a general misunderstanding of statistics isn't the same as a general misuse of it in the expert literature. Also, there is a vast difference between the kind of knowledge and familiarity required to understand what margins of error are and understanding how to interpret the dispersal of points in 100- or 1000-dimensional space. The average individual isn't plugging data into SPSS, SAS, or even Excel, choosing method/model/test X because they were taught to associate it with research design/question Y. When a consumer fails to understand the fallacies of some statistical argument, it is unfortunate. When researchers produce studies making these fallacious arguments, it is far more serious. The kind of understanding of statistics that researchers in most fields require is qualitatively different from the kind of familiarity with probability and statistical theory your average individual could benefit from. The kind of educational changes needed to vastly improve the average individual's ability to critically assess the statistics general audiences are exposed to would do little to nothing for researchers.
Andrew: This is one of the things that prompted the question - when researchers produce studies making false or unsubstantiated statistical arguments (either deliberately or inadvertently), this is often missed because they have shown (superficially) an 'appropriate level of statistical / scientific detail' - a raft of numbers, tests & tables which suggests that something meaningful has been undertaken. If a more transparent approach to demonstrating their argument were taken (using perhaps simpler analysis & clear argument), then such gaps might be more easily spotted. Most reviewer 'experts' will not admit that their statistical expertise is limited, so lots of window dressing may well be an effective means to demonstrate 'rigor' (even if the tests and understanding may in fact be inappropriate). This is especially true of specialities that, although based in fields like the arts & humanities, are increasingly adopting a scientific (pseudo-scientific) image, in the belief that it increases their standing and quality. I think your replies clearly demonstrate the difference between the application of (& need for) statistics in the true sciences, & the patterns of adoption by outside fields.
As a sustainability practitioner with an interest in how academic perspectives can facilitate my work, rather than an academic, my feeling is that the level of understanding of the intended audience should be clarified from the outset. In assessing impacts, it's all too easy to 'demonstrate' that a corporation or a process or a system is reducing its impacts, when all that is being done is reducing relative, rather than absolute, impacts. Changes in organisational size, structure and function can be used to obscure real trends. I read a great number of articles in reasonably reputable practitioner journals that raise more questions than they answer about what the results actually mean.
The statistical methodology I see in sustainability work is generally at the most simple end of the spectrum but, even then, can be difficult for even relatively well-informed occasional users to appreciate (a recent case of SO2 concentration monitoring in flare emissions at an oil facility and in nearby villages comes to mind), apply and interpret properly. In areas of research which have direct relevance to practitioners, it's essential to be able to assess the way that the typical user will interpret the findings of the research. I would suggest that the choice of methodology and the design of experiments and projects in certain areas of research would often benefit from some preliminary work with possible non-academic end-users to determine the most robust way to analyse and report findings.
I appreciate that academic research has a different objective than performance reporting in corporations and organisations, but in areas where the links between the two are particularly close, the downstream consequences of the chosen statistical method would be better understood if they are assessed to some extent in advance. In most cases, this would be best served, where there is a choice, by using simple analysis over more complex methods, or by removing the need to cite and interpret statistical analysis if at all possible.
@Filippo and @Nicholas, thank you for your comments.
It's years since I've done detailed statistical analysis in my own degrees, but I retain enough knowledge and understanding to know how easily abused statistics are. I would suggest that a simple test might be to see whether a narrative description of the results of a statistical analysis is (a) easy to understand in its own right; and (b) makes sense when read alongside the analysis itself. That would be a good test of the level of understanding of the researcher as well as the effectiveness of meeting the necessary level of communication to a wider audience.
Andrew - this clarity is pretty much the issue. No one argues against the need or place of statistics, yet all too often we read stand-alone terms like '... the level of X was statistically significant; the SD was ...; & the ubiquitous p=0.005'. What is not said is whether there is any relevance or significance to the topic under discussion (& why), why group X differs from the rest of the main population, etc. This applies mainly to the descriptive statistics commonly seen (& required) in PhD work. Instead of a short thesis & 4 or so articles, I would rather see a longer work that properly describes the findings & their actual relationship to the topic. If tests are useful in revealing & describing information, then by all means run them, but we can only demonstrate that we have understood the test & its results by giving evidence of its interpretation & context. Appendices of numeric 'proof' are often taken as scientific rigour, but I think clear reasoning & expression are as important - if not more so. If we encouraged clarity over pseudo-science, then we might get more meaningful & useful research 🙊
I agree with much of what has been stated already. I might add to this that it also makes a difference how large, or important, an effect one is looking for and would find valuable/useful. Many of the more "advanced" statistical methods are really best suited to finding very small amounts of variance (used here to represent effect) relative to the "less advanced" models. While this is indeed interesting, and perhaps valuable from a theory-development or basic-science standpoint, the current practical application is often questionable. This does not mean that theory development or basic science is not valuable, but the plethora of applications of these advanced models in applied settings makes me question what the purpose was for doing so. There are times I think that publications become more of a showcase for statistical modeling and less about finding practical application.
Dale - well said. It does indeed make you wonder who we are writing for & why :-)
You should never be comfortable doing something because you were told to or because convention told you to.
After years of doing what statisticians told me to do, I entered my PhD program with the intent to make the statistics I use my own, and I was fortunate to finally have a statistics professor who made it click for me. To top it off, I married a PhD statistician.
I try to avoid asking her for too much help, but I thankfully have her for gut checks when I'm stuck on something or trying something new.
She tells me I am better than 95% of non-statisticians, but sometimes I make her nervous. She teaches statistics to non-statisticians (natural resources and engineers) and I've given her a lot of feedback about what is most helpful to students as she's developed courses; she's backed off a lot of the foundational math (expected values and what-not) in favor of teaching people how to know the right statistics for a given problem and when to know you're over your head. Then how to ask the right questions of a statistics consultant, how to read the results from a statistics consultant.
There are a lot of designs that I use that I apply regularly and have a good degree of familiarity. But the distinction between her and me is that she knows the language of statistics, and while I can accomplish a lot, all I have is a phrasebook. I like to think that I know when I'm over my head, and I am finally to a point where I can make reasonable attempts to teach myself methods I didn't learn in school as long as it doesn't get too complicated.
The one misconception that many people have is that statistics is a static field with static methods. It isn't algebra. There are new methods being developed, with specific tools for specific problems. For instance, people still use split plots for repeated measures analysis, but mixed models are more appropriate and becoming more prevalent in the literature. Bayesian analogs to basic Gaussian methods are becoming more prevalent in ecological problems (and may be preferred because they can use prior knowledge to improve the model), but only recently has computing power made them more accessible.
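As a rough sketch of the mixed-model point (simulated repeated-measures data, not any of the analyses discussed here), a random intercept per subject in Python's statsmodels might look like this:

```python
# Hypothetical repeated measures: 20 subjects, 4 visits each.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_visits = 20, 4
subject = np.repeat(np.arange(n_subj), n_visits)
visit = np.tile(np.arange(n_visits), n_subj)
subj_effect = rng.normal(0, 1.0, n_subj)[subject]   # between-subject variability
y = 10 + 0.5 * visit + subj_effect + rng.normal(0, 0.5, n_subj * n_visits)

df = pd.DataFrame({"subject": subject, "visit": visit, "y": y})

# Random intercept for each subject; the 'visit' slope is the within-subject trend.
m = smf.mixedlm("y ~ visit", data=df, groups=df["subject"]).fit()
print(m.summary())
```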
There is also always more than one way to skin a cat; as reviewers we should be careful about criticizing work that, while not the way we would have approached the problem, or perhaps missing some technicality that would make it better, isn't technically wrong. If you're operating in a system where slight changes in methodology cause dramatic differences in meaning/conclusions, then you probably should have a statistician on board the project.
George Box "All models are wrong, some models are useful"
Great discussion! If systematic observation is used throughout an experiment, then the reporting of changes in what was observed should suffice to answer the research question(s). The earliest examples of the appropriate use of the scientific method can be found in John Snow's removal of a pump handle to stem the spread of disease (http://www.ph.ucla.edu/epi/snow/removal.html) and James Lind's cure for scurvy (http://www.bbc.co.uk/history/historic_figures/lind_james.shtml). These are 2 examples in which properly conducted research actually advanced the public's health, and both without extensive use of convoluted statistical testing. Especially with easy access to desktop statistical programs, people today who can run these programs think that is enough to impress others. Without a proper understanding of research methods and how to interpret statistics, reams of output are totally meaningless. It is not enough to learn how to run statistical programs without a firm foundation in research methods, data management and statistical analyses. Pity the public that has to read research papers that are published with meaningless statistical tables with nebulous interpretations, or none at all! So, in response to Nicholas - Occam's razor should rule the conduct and reporting of research. Finally, I have a few webpages devoted to research and statistics. Feel free to explore.
http://www.bettycjung.net/Study.htm http://www.bettycjung.net/Statfxs.htm http://www.bettycjung.net/Statpgms.htm
Occam's razor is apt when there are competing hypotheses. If we produce complicated predictive statistics (that are perhaps inappropriate for the case in question), then we are in danger of pseudo-science. It has earlier been commented that: "When a consumer fails to understand the fallacies of some statistical argument, it is unfortunate. When researchers produce studies making these fallacious arguments, it is far more serious". I think this argues both sides of the coin quite effectively. Firstly, if we make things overly complicated then we make it difficult for the consumer to understand. I think, however, that it is wrong to attribute this misunderstanding to some (statistical) knowledge deficit on the part of the reader. Many readers just want a clear explanation of what is being argued, & the raft of statistical detail that is 'required' often prevents this. I do not think, however, that researchers aim to make fallacious arguments - rather, their inclusion of statistical 'evidence' often leads them & others to believe they have made their case. Luckily, there is a small contingent of knowledgeable statistics bods who can see through such associations .... but I would venture to guess that, despite being experts in our fields, many of us are not so well equipped. We read an article & simply come to the conclusion that the author has not said enough to convince us of their argument. I do not think that this type of critique is any less valid than one supported by highlighting inappropriate use of statistical method. What is important though (in either case) is that we have failed to meet the needs of the 'consumer', and surely this is one of the primary aims in undertaking & reporting research?
What's complicated? Are we talking about using generalized linear models in place of ANOVA? The advantage of "complicated" techniques that confuse most people is that they permit us to deal properly with non-normal data without resorting to the even more confusing and obscure transformations of yore.
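For instance, a minimal sketch (hypothetical count data, in Python) of fitting a Poisson GLM directly, rather than log-transforming counts and running an ANOVA, could look like this:

```python
# Hypothetical counts per plot for two groups; no transformation needed.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
group = np.repeat(["control", "treatment"], 30)
mu = np.where(group == "control", 4.0, 7.0)     # assumed true mean counts
counts = rng.poisson(mu)

df = pd.DataFrame({"group": group, "counts": counts})

# The treatment coefficient is a log rate ratio on the model's own scale,
# so there is no need for log(count + 0.5)-style fudges before an ANOVA.
glm = smf.glm("counts ~ group", data=df, family=sm.families.Poisson()).fit()
print(glm.summary())
```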
Statistical methods may be window dressing, but regardless of an experiment's statistical significance, there is still the matter of practical importance.
Mark: here I am referring to the inclusion of reams of statistical information *without any clear linkage to what they really imply or why the tests are merited*. This makes trying to work out the validity of otherwise basic issues unnecessarily complicated.
As long as the content of courses essentially culminates in the importance and application of hypothesis tests we will never manage to educate students in statistical literacy or statistical thinking.
I was once in a class where a straight-A student frustratingly asked why we don't calculate 100% confidence limits.
and for some levity
http://phdcomics.com/comics/archive/phd051809s.gif
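On the 100% confidence limits anecdote above, a small Python sketch (hypothetical numbers) makes the problem visible: as the confidence level approaches 100%, the interval's width grows without bound.

```python
# Hypothetical sample: watch the t-based interval widen as confidence -> 100%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(10.0, 2.0, 25)
se = x.std(ddof=1) / np.sqrt(x.size)

for level in (0.90, 0.95, 0.99, 0.999, 0.999999):
    t_crit = stats.t.ppf(1 - (1 - level) / 2, df=x.size - 1)
    print(f"{level:.6%} CI half-width: {t_crit * se:.2f}")

# At exactly 100% the critical value is infinite, so the "interval" is the
# whole real line -- which tells us nothing.
```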
As the research practitioner, you may need to. However, the end user needs to have a clear understanding of the interpretation of the analysis.
Aeron: sorry for the very late response (difficult to keep up with older threads). I think the point about end-user comprehension is very important. I would venture that a huge number of people have limited statistical knowledge. Unfortunately, this has come to be interpreted as an intellectual deficit, so people rarely admit to it. I think that to assume your readers have an in-depth or working knowledge of unexplained statistics is ... an assumption. Even when we assert that we write for a certain level / field, I think that the tests we run and report have to be clearly explained in words, and not just left as numerical statements. Not only does this help explain the context and application of your work, but it also shows that you have correctly chosen, interpreted and related the results. I think this is an important part of academic education & also needs to be stressed in statistical teaching.