How can we deal with HARKing (hypothesizing after the results are known) and other such issues without pre-registration? Or is pre-registration more than just a pipe dream?
Well, it's probably a pipe dream right now, but it has the smack of reality about it. The "selling point" for some of us might be that it gives us a chance to register our cool, creative ideas before we've had a chance to gather data. If (like me) you're in a setting where it is difficult and time-consuming to collect data, this would be a very good thing. As things stand now, if you have a good idea a few months ahead of someone else, but they are in a bigger/better funded lab, they will probably beat you into print. Likewise if you are the only one to have the idea but you discuss it with colleagues and word gets out.
The "down" side, for some researchers, would be losing that ability to look all-knowing and wise by making up hypotheses that fit the results of extensive analyses of large data sets. That sort of thing is a nuisance, to be sure, though in the end most spurious ideas of this kind don't hold up under replication.
So the balance of power is probably against you. Big labs with lots of loose data and careerist types yearning to publish half a dozen pieces a year will resist this idea.
I think it is a pipe dream right now but could become a reality. If the current rate of fraud and non-replicability continues to get press, then it may come sooner rather than later.
However, this will never be a complete fix and I doubt if there is one.
In my view, there is a price to be paid for everything. For example, I strongly support the tenure system even though I recognize that some faculty become intellectually inactive once they achieve tenure. So what? It is the price of ensuring academic freedom, and there are steps we can take to minimize that price, though we will never eliminate it. Similarly, an experiment registry, like the current clinical trials registry, will help reduce problems but not eliminate them. All this emphasizes what we all should focus on: independent replication.
Imagine if we had a discipline based only on phenomena that have been replicated. Now that is the discipline I want to be involved in. So how do we get there? Registration of experiments would be a start but clearly not enough.
I found a good article in a Scientific American Reader addressing this subject last Sunday, entitled "Bad Biomedical Research on the Rise." The article pointed out that the bad reputation of research has led to about 90% of research going uncited. It is important for researchers to find ways of making their work trustworthy, because too much helpful research is going unnoticed.
There is something like this already for NIH researchers, who must register their (randomized controlled) trial and specify their primary and secondary hypotheses before collecting data (of course it's in the grant proposal too, which obviously was written before funding and data collection). Maybe that concept/structure could be built upon?
Maybe one useful step in the right direction is for NIH (actually most HHS-funded experimental research) to allow registration of hypotheses for all studies, instead of just NIH-funded research. As it is, I'd like to value papers with pre-registered hypotheses more highly than papers with no visible antecedents (e.g., as an editor), but that would bias me toward funded research.
I'd also like to tamp down polarization around the topic by acknowledging there is plenty of good work that doesn't require hypotheses a priori - an insightful observational study can raise hypotheses in the discussion. And I don't want work to turn into some Red Guard-style criticism of our prognostic inadequacies. It's the practice of putting together tables before writing an introduction that is dangerous and damaging. I have always insisted to students that they write an introduction before anything else, but I know there are those who leave it until nearer the end of the process.
The last issue this made me think about is how so many scientists are afraid to be wrong and trained to be afraid. Learning in a sea of papers with well-supported "hypotheses" is not helpful for training people to know where they are wrong and live with it!
I like this developing idea of having voluntary registries set up. Whether it's NIH, one of the professional associations involved (in my case, APA or APS), or even a journal editor offering more favorable treatment, it could be done. I'm still skeptical about it happening anytime soon, but it starts to sound less like an opium-fueled fantasy.
It certainly won't be quick. But a registry has a number of advantages, including enhancing reliability through replication (another conversation in which I got involved). That is, a registered set of hypotheses and conditions could lead other interested parties to run the experiments as well, or run them cooperatively with the originator. Then the field gets multiple estimates of the effect, with the original registry showing the originator of the idea. In public health and biomedical fields, journals often require registration of experiments through sites like clinicaltrials.gov before they will publish the work.
That sort of system is at risk for the equivalent of patent trolls - in this case, people who lay out hypotheses with no intention of doing the work. So a system would need some safeguards.
The registered set of hypotheses needn't be accessible to anyone but the editor/reviewers when the paper is submitted (i.e., via a link and passcode). The date the hypotheses were specified, and any changes to the protocol, analyses, etc. (and the reasons for them), would also be recorded by date so that there is a clear record (as I believe there is now for RCTs).
Some journals require authors to specify who did what (i.e., who gathered the data and who was responsible for data entry, uploading the raw data, data clean-up and analysis, and so forth). The registry could specify the dates when the data were collected, who oversaw the process, etc. If there are later questions about fraud, etc., it's immediately apparent who was responsible for each step in the process.
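To make that concrete, here is a purely hypothetical sketch of what one registry record might contain, based on the fields described above (dated hypotheses, dated amendments with reasons, who was responsible for each step, reviewer-only visibility). All field names and values are invented for illustration, not any existing registry's schema.

```python
# Hypothetical registry record; field names and contents are invented.
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple

@dataclass
class Amendment:
    when: date
    what_changed: str   # e.g., protocol, analysis plan, measures
    reason: str

@dataclass
class RegistryEntry:
    registered_on: date
    hypotheses: List[str]                      # the a priori hypotheses, verbatim
    analysis_plan: str
    roles: Dict[str, str]                      # step -> person responsible
    data_collection: Optional[Tuple[date, date]] = None
    amendments: List[Amendment] = field(default_factory=list)
    visibility: str = "editor/reviewers via link and passcode"

entry = RegistryEntry(
    registered_on=date(2014, 1, 15),
    hypotheses=["H1: condition A > condition B on measure X (directional)."],
    analysis_plan="2x2 ANOVA on X; one-tailed contrast for H1; alpha = .05",
    roles={"data collection": "R. Smith", "data entry": "J. Doe", "analysis": "R. Smith"},
)
```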
I wouldn't mind a public registry of hypotheses, as this would essentially be a compendium of what was going on in the field. It also gives the field a chance to evaluate and refine the hypotheses as well as react to findings and enable independent data collection around the same hypotheses (get your replication concurrently).
We just need a mechanism to be sure the authors of hypotheses are neither cut out from the work, which incentivizes hoarding of hypotheses, nor given control over the work, which reduces independence and encourages trolling.
Beautiful as this might seem, it could turn out to be counter-productive by limiting the freedom of researchers. It is commonplace to see many research papers that treat the same subject, allowing for comparisons of findings. Suppose a researcher registers an experiment and does not get round to collecting the data, let alone writing up? Would others lose the freedom to investigate the matter? The variables in psychological investigations - ranging from case studies to inter- and intra-subject variability and contextual factors - are practically infinite, to the extent that it could be difficult to legislate experiments (if at all), particularly prior to data acquisition. Imagine, for example, the number of researchers who have worked on music or speech perception in the history of auditory analysis. To this day, many issues are still unresolved, and more and more people are undertaking research on the same subjects but with different approaches. Implementing the idea globally would be hard, and the benefits to the individual or to research are not readily predictable.
I think it is a necessity to prevent psychology and other "soft" sciences from becoming even more stigmatised. With a pre-registration system, it will be harder for researchers to look all-knowing, as previously described by Stephen, but it will also be harder for critics to claim this is happening in our field.
Unfortunately, there have been cases of fraud, and acknowledgement that researchers can sometimes use questionable techniques to increase publishing power. For instance, a paper on U.S. psychologists found that about 60% admitted anonymously to having done things like adding more participants after seeing non-significant results. I can't remember the paper now, but it should be easy enough to find (I'm pretty sure my numbers are right, but I could be wrong).
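To see why that particular practice matters, here is a rough Monte Carlo sketch (my own illustrative numbers, not anything from the paper Shane mentions) of how peeking at the data and adding participants whenever p >= .05 inflates the false-positive rate even when there is no real effect at all.

```python
# Optional stopping under a true null: peek, and top up the sample if not significant.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_sims, n_start, n_add, alpha = 2000, 20, 10, 0.05
false_positives = 0

for _ in range(n_sims):
    a, b = rng.normal(size=n_start), rng.normal(size=n_start)  # no true difference
    for _ in range(5):                      # up to 5 peeks at the accumulating data
        t, p = ttest_ind(a, b)
        if p < alpha:
            false_positives += 1
            break
        a = np.concatenate([a, rng.normal(size=n_add)])
        b = np.concatenate([b, rng.normal(size=n_add)])

print(f"false-positive rate with optional stopping: {false_positives / n_sims:.2f}")
# typically well above the nominal 0.05 (roughly 0.10-0.14 under these assumed numbers)
```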
On the issue of research without hypotheses: I think registration could also have advantages here, as such work would be registered as exploratory, and having no hypothesis would allow more freedom. It would prevent exploratory studies from being run and hypotheses then being made up, after the data are collected, to suit the findings.
All in all, I think it can work and is working for other fields, but I do think that ethics systems could be used to get it up and running. If every university had a policy that every study that passes ethics review is registered, then it wouldn't necessarily take much extra time. I wonder how anyone else feels about that idea (maybe I'm missing a rather large downside)?
Shane, I think your sentiments illustrate what is needed - there's not so much an unavoidable downside as pitfalls to avoid. Akpan Essien identified some of them as well. What I would like to see is a set of registered hypotheses to which others could respond: (1) through potentially refining the hypotheses and (2) through using the hypotheses to guide their own parallel research. In an earlier post I alluded to concurrent replication, and several groups working from the same hypotheses could generate a series of research results that test the hypotheses under varied conditions and so forth. And it certainly wouldn't hurt *me* to get feedback on how best to specify and theoretically support my hypotheses. A registry covers that.
I would avoid compulsory registration and so forth. Also, there needs to be a way to acknowledge the person whose hypothesis one is using. Finally, managing any registry (fraud, assuring novel hypotheses, etc.) is not trivial. But I think there would be substantial benefits.
A lot of people are writing about ownership of hypotheses. Some ideas are very unique, and to lose credit for them because you don't publish before someone else tests them would be extremely annoying and would cost you a lot in terms of publications and career prospects. Perhaps there is some way to pre-register but withhold access until publication, or earlier if the researcher permits it. This way you could still get credit for the idea and keep it until you can do something with it, and the pre-registration would still exist, so no one could argue you tested widely and reported what worked. At the same time, it would stop any arguments when people independently come up with the same or similar ideas: since the hypotheses are withheld, it would only be by chance that someone else had the same idea (or hackers, or you told someone, or any other normal way too :) ).
I think we should value avoiding Type I error so much that no progress is ever made in social psychology. Don't let anyone tell you that privileging Type I error avoidance has Type II error consequences. They're weenies. Pre-registration is an excellent component of the never-make-any-errors/statistics-should-rule-inquiry strategy. Curious surprises should have no value in our science. We should stigmatize discovery in every way, and elevate only ideas that we had before we collected the data. This will ensure that our science is as derivative as possible.
Surely there are no costs to pre-registration, and it is a value-free way to improve science to the point of sheer solidity.
Chris- Curious surprises and rigorous hypothesis testing (where the hypothesis is formulated prior to data collection) are part of the same bigger process. Exploring data and coming up with neat findings is a great part of science -- but we then shouldn't frame the intro to such papers so as to suggest that we had this idea a priori (when we didn't). Basically, we just have to tell it the way it was/is. On the other hand, neat surprises from exploration can form the basis for subsequently tested hypotheses that can be registered as such. In the latter case, these provide a record of potential negative findings (useful for meta-analyses) and should have advantages (e.g., in terms of using one-tailed tests).
Lynn: I just don't think registration adds anything at all to scientific progress. It only slows things down. It creates a false dichotomy between the "context of discovery" and the "context of confirmation." I am not against anyone doing it. I am against anyone discriminating against non-pre-registered hypotheses.
I am firmly convinced that this will pass.
Furthermore, there are many contexts in which preregistration is simply impossible. Why should we kick this to the side?
There are certainly problems in science to be solved--all is not peachy. But I think pre-registration adds difficulty, with very little reward.
This is, I think, an anti-zeitgeist position. But I take it thinking I'm on the right track. I know that others differ. We'll see who prevails--I promise to be gracious in loss, and irritating in victory. :-)
Chris- Yes, it's true there's that gut reaction that might go like this: "oh, hell, what now are we being asked to do? And, boy, isn't this taking the spontaneity out of the whole process?" I get it. But, having had to write grants (and conduct research based on those pre-stated and preregistered hypotheses for funded clinical trials with sufficient n's based on pre-study power analyses), I have come to appreciate the advantages of pre-registration. For one, I remember being taught once upon a time about one-tailed versus two-tailed tests...but who can publish using one-tailed tests anymore -- the gist seemed to be that no one trusts that these one-tailed tests really reflect an a priori directional hypothesis. But those who really had a priori hypotheses were getting the short end of the stick here. With pre-registration, those studies should be "privileged" as involving a priori directional hypotheses -- with implications for power. So, maybe those proposing pre-registration should also be simultaneously advocating for such "upsides".
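For what it's worth, here is a minimal sketch of the power advantage Lynn is describing for a genuinely a priori directional hypothesis tested one-tailed. It assumes a simple one-sample z-test approximation with an illustrative effect size and n, not anyone's actual trial.

```python
# Power of a z-test under an assumed standardized effect d and sample size n.
from scipy.stats import norm

def power_z(d, n, alpha, tails=2):
    ncp = d * n ** 0.5                   # noncentrality of the test statistic
    crit = norm.ppf(1 - alpha / tails)   # critical value, one- or two-tailed
    # ignore the negligible chance of rejecting in the wrong direction
    return 1 - norm.cdf(crit - ncp)

d, n, alpha = 0.3, 100, 0.05             # illustrative numbers only
print(f"two-tailed power: {power_z(d, n, alpha, tails=2):.2f}")  # ~0.85
print(f"one-tailed power: {power_z(d, n, alpha, tails=1):.2f}")  # ~0.91
```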
I also know that many of us were told by editors over the years to state as a hypothesis what was really a question...with the justification that this makes it easier for the reader...maybe so, but it muddies the waters in terms of being transparent about what was really an a priori hypothesis and what wasn't. I think it's all about transparency...and transparency about our processes is -- I believe -- critical to credibility...and good science.
For those cases where there's a just-in-time need to pre-register (not much time to catch a timely phenomenon), why not create easy mechanisms to do this? The bottom line is that consumers of science have the right to know whether a hypothesis was really a priori or not. The current system, without preregistration, muddies the waters and quite frankly disadvantages research hypotheses that truly are a priori.
Perhaps your point of view, Chris, will prevail, but that's a different issue, it seems to me, than what is in the best interest of science and the credibility of our work (and a given field) to the broader public.
Lynn: My resistance to preregistration is not a gut reaction, it's not based in status quo bias, and it *is* based in the sense that preregistration will be harmful to scientific progress.
I agree that there is substantial value to the kind of thinking that goes into grantsmanship. It's fantastic. But it's not ALL of science, and it's not required, and it's not always the best thing to do. It's *one* very good thing to do, among many.
I've stated elsewhere (and often) that the preregistration of clinical trials is an absolute must. It's entirely obvious, and not a point of debate among much of anybody, I don't think.
And furthermore, the reformers pushing bigger N's--totally makes sense. There is opportunity cost to big N, but it strikes me that the balance between that cost and that benefit is on the side of a larger N than we typically see in journals.
But preregistration is based on the notion that a priori hypotheses are somehow privileged. As you point out, even the one thing they really deserve--one-tailed tests--isn't on offer anymore anyway.
You want me to believe that the truth value of a statistical test is affected by the mental state of the scientist prior to the collection of data. I simply don't think so.
And furthermore, I think that the costs associated with many of the "reforms" are much, much greater in terms of Type II error than they are at solving Type I error. I do not support the fetishization of Type I error, and furthermore, I do not support the fetishization of statistics in scientific progress.
I have no objection to people preregistering. Go ahead! I just don't think it does much of any good. (Excepting clinical trials, where it is essential.)
Maybe it would help to stop talking about "pre-registration" and just think about putting hypotheses in a registry. Chris, I am not sure that the issues you have around error rates, the use of statistics and so forth have much to do with this thread. One could just as easily argue that an open list of hypotheses - generated from whatever combination of theory, formative work or "curious surprises" found in the course of other work - should open the field up a bit.
I don't think anyone wants you to believe the truth value of a statistical test is affected by the mental state of the scientist prior to the collection of data. I do think people want you to believe that the truth value of the hypothesis induced after the findings are known is unknown without replication (HARKing obscures that problem). Replication can certainly be managed without a registry, but I don't see that one hurts the field. After all, what's a journal other than a list of findings in the field?
Matthew: Thanks. I do prefer the language of your first paragraph.
But, the first two sentences of your second paragraph are, to my reading, at odds with each other; the second negates the first.
Replication is everything--I think we all agree on that. But replication is simply just as necessary after a preregistered empirical observation as it is after an unregistered (or pure surprise!) empirical observation.
If preregistration interferes with vigorous data exploration (or devalues it substantially), then it unnecessarily increases Type II error. And that makes it A Bad Thing.
If we can agree that registration of hypotheses is sometimes a good practice, and not necessary all the time, then we're on the same page.
I don't understand how those two sentences are at odds with one another. One is about the content of a test - e.g., the extent to which two variables are related to one another. That's independent of the mental state of the scientist. The second is about inducing the reasons why two variables are related to one another on a test after the result. That is very much dependent on the mental state of the scientist. A good scientist will recognize that the induced reason is itself a hypothesis that should be tested in a replication study, hopefully against a decent counterfactual. A bad scientist will write a post hoc introduction that sets up the path to the result as though it were designed a priori.
I agree with you about the general importance of replication and hope a registry would aid the process as an information-sharing tool. In fact, with enough readers exposed to the hypothesis and encouraged to conduct the studies, the question would become an issue of "under what circumstances does this finding hold" - less a quasi-fraud investigation dampening curiosity and more about, say, how fundamental *is* that fundamental attribution error.
Journals are a registry of hypotheses - just a biased one in that the hypotheses are supported at least once and replication discouraged from the POV of publication (which is a different issue).
BTW, this issue may be even more prevalent in biomedical research which is, of course, where RCT registration before conduct is considered essential (you referred to this). Good luck to us either way.
I respect your position, and agree with much of it.
I do want to take issue with one point, which applies to my field of social psychology. Journals do not discourage replication, per se. Quite often, in my field, the highest value journals include multiple replications within the same article, when feasible.
I just checked 7 of my Journal of Personality and Social Psychology articles, and all 7 of them have built-in replications of the major points (but not all of the minor points). This includes two articles in difficult field settings (sororities and binge eating over the course of a year, and the development of stereotypes among Army officers during training at Fort Leavenworth). I don't think I'm particularly attuned to replication more than my peers. We love to be able to show it.
Direct replications of already-published findings by a different lab--that's tough to publish in the same level of journal. But replication per se, including conceptual replications and attempts to extend the hypothesis? Very common.
Fair enough. Psychology journals are much more tolerant of multi-study papers with a series of studies than are most biomedical or public health journals. In fact, doesn't JPSP require multiple studies including replication unless there are extenuating circumstances? I used to get PSPB before I let my membership in SPSP lapse, and many of the papers in there are multi-study affairs, too. Weren't/aren't you one of the editors?
I got my PhD in social psychology, too. It is a useful discipline in public health research, well, in conjunction with a few others.
A few issues:
1. Registering hypotheses is simply a way to make it clear which hypotheses are in fact a priori. I agree that articles that do a first exploratory study followed by subsequent replications should not be "valued" less. But the assumption here is that the subsequent replications really do test a priori hypotheses anyway. So why not register them?
2. Registering could be a simple process that might even leverage other things we have to do anyway when we run a study. For example, don't we all have to do this in our IRB submissions before running studies? These generally are dated. Why don't we just tie into that process? And/or have IRB materials submitted as part of the supplementary material for reviewers.
3. I applaud your consistent multiple replication efforts, Chris. That's great science and much to be admired! I certainly think such work would not be "discriminated against".
4. Although I agree it is less likely to be a problem with multiple replications (although the number of non-replications of exact replications by other labs in some recent projects really raises concerns), I can imagine hypotheticals suggesting that even when a researcher has replications it needn't mean that the later work involved a priori hypotheses, and it could mean that alpha was, perhaps inadvertently, inflated. Registration could actually protect researchers from reviewers' potential issues/concerns here.
For example, maybe it's my overactive imagination...but I can easily imagine what could happen...and maybe does...at least to some extent. My hypothetical:
A researcher runs a lot of experiments (let's say 5) involving relatively simple stimuli, perhaps over the web, all at the same time, with a lot of conditions (let's say 16 each) and many measures (20, let's say) -- then looks across studies and sees that there is a consistent pattern for a 2 x 2 in the mix across 3 of the studies involving 3 of those variables but not the others. Then the researcher says, "oh right, my hypothesis was really x, which happens to jibe with this 2 x 2 pattern in 3 studies for these 3 variables -- I can tell a nice story here...I'll just not mention those other conditions, cells, those other studies and variables, or that this really wasn't an a priori hypothesis in the first place in any of these 'studies'. And besides, I would have expected the pattern with those other variables too, but hey, I didn't get it, so I just won't mention that part. Three 'consistent' studies are enough to get this published in journal x, so I'll just write this up as is." Granted, it's unlikely (I hope) that we'd find a researcher doing all of these extreme things. But bits and pieces of things like this? Yes, I think that's quite possible. (A rough simulation, sketched after this list, suggests how easily chance alone can produce that kind of "consistent" pattern.)
Registering hypotheses in advance of studies reduces the probability of this type of situation (where authors might be trying to suggest that they have a priori hypotheses when they don't). Furthermore, it makes it clear that in your follow-up studies you were just testing the hypotheses reported (with the variables used, etc.). Registering would also protect researchers from some concerns from reviewers [registering makes it clear what you're not doing].
Registering hypotheses could be relatively simple as suggested above using dated IRB submissions. This adds transparency and would not be an additional burden on researchers. And, it could alert reviewers to potential multiple additional tests (and unreported inconsistent conditions) with many other measures (not used in the report) that may have really inflated alpha in problematic ways.
In short, registering hypotheses (perhaps through IRB processes/materials) could definitely have benefits for transparency about what is and isn't a priori and have other plusses (e.g., questioning what the real alpha for the studies might be).
5. Again, there is nothing in what is said above that says exploratory work isn't valued. It absolutely is the life blood for one's continuing research ideas and hypotheses. But, without the next step (testing that interesting hypothesis a priori in a subsequent study), it's not as complete a package as would be expected from some journals.
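As promised in point 4, here is a back-of-the-envelope simulation of the hypothetical above. The numbers are entirely made up for illustration; I assume each study affords roughly 400 ways to slice the data (measures crossed with candidate 2 x 2 contrasts), all with no true effect. The point is only how easily some slice can look "consistent" across studies by chance.

```python
# With no true effects at all, how often does SOME way of slicing the data
# come out "significant" (p < .05) in at least 3 of 5 studies?
import numpy as np

rng = np.random.default_rng(1)
n_sims, n_studies, n_slices, alpha = 5000, 5, 400, 0.05

hits = 0
for _ in range(n_sims):
    # p-values are uniform under the null; one row per study, one column per slice
    p = rng.uniform(size=(n_studies, n_slices))
    sig_counts = (p < alpha).sum(axis=0)   # per slice: "significant" in how many studies?
    if (sig_counts >= 3).any():            # some slice looks "consistent" across >= 3 studies
        hits += 1

print(f"chance of a publishable-looking 'consistent pattern': {hits / n_sims:.2f}")
# roughly 0.3-0.4 under these assumed numbers, despite zero real effects
```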
I am in favor, in principle, of pre-registering experiments and empirical studies in general. Lynn Miller's thought experiment is not that far-out in my experience. I can see two systems that might work. First, researchers are required to pre-specify their main hypotheses as well as some boundary conditions. Then they are free to test these hypotheses with multiple studies and data sets that need not be pre-registered. This leaves relative flexibility and may be an honest intellectual approach. The second option would be to register each and every single study with an overview of the methods and measures. The real advantage of this is that researchers cannot cherry-pick their results anymore, and it is transparent what variables were included in each and every experiment. This would get around the issues mentioned in the thought experiment. I can see advantages and disadvantages to both of these approaches. Of course, both approaches could still be manipulated, but it should help with the 90+% confirmed hypotheses in psychology. The pharmaceutical industry is a good example demonstrating that it (kind of) could work. Of course, exploratory studies and novel hypotheses that emerge from the data should be possible, but they should not be presented as theoretical a priori hypotheses.
A friend just sent me the link to the multi-labs replication project looking at 13 effects and 36 samples. The upshot is that Danny Kahneman's effects replicate in independent labs - most in a way much stronger than his original findings (amazing!). Most of the other effects replicate nicely too...but two priming effects studies (Carter et al., 2011; Caruso et al., 2012) do not come close to replicating. This despite within-report replications. The X in the figure is the originally reported effect size -- you can see in the figure how this compares to the findings from the independent lab replication attempts. http://t.co/rXb1pXytTY
R. Fischer: Holy moly! Some of your suggestions border on the dictatorial! Please keep in mind the costs associated with your prescriptions. I believe that they are substantial, and the benefits accrued by your proposal would not stack up well to the costs of doing what you suggest.
Really, folks, you think cherry-picking results is the biggest sin in science? I would have thought unambitious research, derivative hypothesis testing, lack of funding, tedious journal processes, tired theories, and so on would be much more worthy of discussion. P-hacking is our biggest problem? I don't think so.
Hence, my words 'in principle'. A lot of us have to go through IRB processes already (as noted above in another post). It may not be too difficult to organize these processes in a way that allows greater transparency of what was originally proposed and later reported. As a journal editor, I sometimes wish I could go back and see what variables were actually included in a study or what the design originally looked like before it is presented in its final format at submission to a journal. I have no problem with some stuff not working out... I just would like to know what did work and what did not work. This does not need to be in the article itself, but it should be available for those people who are interested in it (e.g., researchers trying to replicate or extend research). All I am arguing for is greater transparency - not dictatorial processes that obscure what is happening. This is ideal-world thinking, but as I hinted in my original post, there are lots of drawbacks too and a good number of folks will probably object to it.
I really don't see what the big deal/cost is here. Researchers can do exploratory work, find a neat effect, then submit an IRB application for another set of studies, specifying hypotheses (a priori) and measures in attachments as we do now (with all instructions, conditions, etc.). Researchers then submit their dated IRBs as part of the journal review process (basically they answer the question of what was a priori, etc.). To me, the potential cost of not taking such steps is what's huge (i.e., researchers spending years trying to build on work that isn't sound to begin with, other sciences questioning the value/credibility of our work, potential reduced funding to our field, etc.).
A hypothesis registry is a tool, so I think the disagreement is really about how one views the field (of social psychology for most on here?). Perhaps there are far too many checks that reinforce a staid and unimaginative state of affairs, and a registry in the hands of such institutions will become another means of drying out fresh ideas. We understand this does not require malice - it's the power of the situation. There is plenty of empirical work on social networks that would support this vision.
Alternatively, the field is beset by quasi-scientists, who, knowingly or not, engage in a great deal of shady analytic practice and reverse logic. What looks like editorial risk aversion to ideas is a function of sorting out a small amount of wheat from a pile of chaff. Again, no malice needed. The p-hacking debate is not so much about Stapel-type fraud as it is about the channel factors that lead to publication.
Note these viewpoints aren't mutually exclusive; more about which is the fundamental problem, or at least, the place to start the fix. But if the Place to Start is risk aversion and over-control, then a registry looks like a control tool with inquisitorial overtones. If the fundamental problem is dodgy or just fragmented science, then it's a quality assurance tool that addresses fragmentation through transparency.
BTW, if one used Lynn's IRB idea, then it's a protocol registry that covers exploratory research as well as hypothesis-driven research. Some fields have such journals.
Let me say that I totally oppose a registry of hypotheses for experiments--but more fundamentally I am opposed to the journal obsession with deductive theorizing as such. It is a perversion of science, which progresses most fundamentally through induction (see my 2007 article in the Journal of Management). The deductive approach does not lead to theory building at all--it leads to phony deduction, and then, when the results are in, the theory-testing process is considered done instead of just beginning. The proper way to do science is to ask questions and then try things. Theory must be built gradually from an accumulation of findings. Nor is direct replication very useful. More important would be replication with variation so you can look at generality. I think the whole scientific enterprise has been messed up by the hypothetico-deductive method. And I speak as someone who has actually built a theory.
Edwin: I'm just going to focus on the issue of replication below. Yes, those of us with a strong background in social psychology were all taught what is reflected in your quote: "direct replication [isn't] very useful. More important would be replication with variation so you can look at generality." But that was before the elephant in the room moved in. What elephant? The one wearing a bright t-shirt that says, "Got replication, really?" In short, how many studies in psychology (including social psychology) and in other scientific fields are reliable - that is, can other labs exactly replicate them? How did that elephant get in the room? Various ongoing efforts (some discussed at our top social psychological conferences in packed, very tension-filled rooms) suggesting that some studies (classics and those published in top journals) aren't reliable for a variety of reasons (e.g., fraud, cherry-picking, details missing, etc.). The elephant (and the fear) is that this problem may actually be widespread - perhaps affecting 1/3 to 1/2 of findings in top journals.
The link I provided above is to a recent multi-lab replication effort (many labs on each effect): http://t.co/rXb1pXytTY. Of 13 classic effects, some dating back to the 1950s, 2 clearly did not come close to replicating (one more was marginal): so somewhere between 15% and 24% (depending on how you count the marginal one) did not replicate, and the ones that did not replicate are of more recent vintage. As for the effort involving more than 150 scientists looking at the 2008 articles in top psychology journals (e.g., Psychological Science, JPSP, etc.) - Nosek's reproducibility project, https://openscienceframework.org/project/EZcUj/wiki/home/ - my read of where they are currently is this: of the studies they have finished replicating, it looks like as many as 1/2 will not come close to replicating. That's a mighty big elephant if that pattern continues. Focusing on conceptual replications only works when we know that the original finding is solid (but direct replications are hard for labs outside the original one to publish). And when the original lab publishes numerous studies with replications and conceptual replications, that's no guarantee the studies will replicate in another lab - that's apparent in the effort just published, where one of the effects that did not replicate across many labs had, according to the original authors, replicated conceptually across many, many studies (http://t.co/rXb1pXytTY). Finding mechanisms to address these problems is the goal. If we don't, it's a problem for the whole field.
As I have said repeatedly (having come up with a theory - Socially Optimized Learning in Virtual Environments - that was built from the ground up from formative research, including extensive interviews, and then tested in formative experimental studies as well as RCTs, three now), I appreciate all the tools at our disposal for asking questions, building theory, and testing specific hypotheses...but throughout, transparency is key to a cumulative science, and looking at where it doesn't work is as informative as where it does in improving our interventions and our science.
The flaw in the argument is that not much should be concluded from one experiment--only from a large number done by different people with different tasks and subjects which show a pattern. Further, the pattern should include causal mechanisms and at least some moderators. Goal-setting theory was induced from close to 400 studies (now there are over 1000). I deliberately refused to call it a theory for 25 years. That's why the deductive method (until much later) is hopeless--quickie theories just do not work. Following my method, even if an original result could be consistently replicated, as per the above, no theory could be formed from it alone. See my 2007 article.
People are treating "hypothesis testing research" as something scientifically different from "exploratory research." This is not a distinction that holds up, or so say the philosophers of science.
Pre-registration does NOT make data more true. It just cannot do that. And so Lynn's concern about "researchers spending years trying to build on work that isn't sound to begin with; other sciences questioning the value/credibility of our work" is unaffected by pre-registration. It's always a risk. Just ask Pons and Fleischmann.
And finally, I take Edwin Locke's comments pretty seriously. I think that most of the "reforms" are based on a philosophy of science that philosophers discarded as foolish and unworkable decades and decades ago.
The possibility of proving a theory to be true: Discarded.
Context of discovery vs. context of confirmation: Discarded.
The possibility of theoretical falsifiability: Discarded.
What's not discarded? Every attempt to reduce Type I error leads to an increase in Type II error.
Will provide PoS citations if people are interested. But not until next week!
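If it helps, here is a minimal numerical sketch of the Type I/Type II trade-off Chris is pointing to, with an assumed effect size and sample size (illustrative only): tighten alpha and, with nothing else changed, power drops.

```python
# Two-tailed z-test power at a fixed assumed effect (d) and sample size (n),
# across stricter alpha levels; all numbers are illustrative.
from scipy.stats import norm

d, n = 0.3, 100
for alpha in (0.05, 0.01, 0.005):
    crit = norm.ppf(1 - alpha / 2)
    power = 1 - norm.cdf(crit - d * n ** 0.5)
    print(f"alpha = {alpha:<5}  power ~ {power:.2f}  Type II error ~ {1 - power:.2f}")
# alpha = 0.05 -> power ~ 0.85; alpha = 0.005 -> power ~ 0.58 under these assumptions
```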
I agree that the philosophers of science, at least the ones I have read, should be rejected. In my JOM article I take Popper to task--his delusion is that science progresses by proving things not to be true. In reality, science gets built by the accumulation of positive evidence - almost never by single studies - and the integration of that evidence. The law of contradiction is fundamental--if something does not come out, either the method was faulty or the theory or hypothesis (if there is one), as stated, is wrong or in need of qualification. If the latter, then it implies that the causal mechanisms are not really understood or that there are moderators. I think the failure to pay more attention to mediators and moderators has held back the field of priming. For example, I did (with others) a priming study which worked. Then two attempts by others to replicate in Germany, a few years later, failed. We have no idea why. Then in another study we found that two favorite priming methods did not work the same. Totally baffling. Then we found that people in Europe had primed with geometric figures, with no words. How does that work? Lots of studies--too many unresolved puzzles--impossible to integrate everything. Direct replications won't solve the problem. We have to know more about how priming actually works.
I have no idea what the Duhem-Quine hypothesis is. I like David Harriman's The Logical Leap--about induction in physics.
All very interesting, thanks (and, yes, I will look up Locke 2007, JOM!). The two points I'd like to make here are (1) I don't see much difference among the writers here on the various positions on science and (2) I still think a registry is a potentially useful tool. That is, I haven't seen that any person posting here is ignorant or disdainful of the value of induction in generating theory, and I think everyone acknowledges the place of deduction. I like the comparisons to physics, where there is a famous example. By 1900, there was a substantial body of empirical observation around planetary motion, enough to generate a cohesive theory. And then Mercury's orbit wasn't quite where it was "supposed" to be (the deduced hypothesis). Depending on the ratio of hyperbole to sweat, a better telescope led to relativity - and we're back to induction.
Oh right, (2). What I've learned from these exchanges is that any registry or compendium of ideas (hypotheses, empirical observations, questions) needs to be managed in such a way that it is not construed as some sort of per se selection method for "good science." But I think most of the supporters on this thread have done their best to think about that pitfall. Edwin Locke: when I read your piece on how 400+ studies contributed to goal-setting theory, I thought that a registry could help generate those studies. In your priming example, shouldn't a registry facilitate the process of discovery you described? It just needs to be used the right way.
My reply to Matthew: No, I don't think a registry would help develop science at all. Let me be more pointed: discovery is about trying stuff. If you read about research labs in the hard sciences, most of their ideas don't work. And many things are discovered that were not predicted or expected. Science is a process of discovering the unknown. Here would be my idea for a journal policy: the manuscript should focus on question asking: what would happen if.... To be accepted, the paper would have to show that the author knew the RELEVANT literature, IF ANY (no phony deductions), and that the study moved the field forward in some non-trivial way. Theory building would be gradual and require many studies. No overgeneralizations or pretentious claims. The introductions would be short, and the discussion would be used to reveal how the study moved science forward and how far along we were in the process of theory building.
Well, I just wish someone would write about *how* a registry would inhibit the development of science. At this point, the skeptical approach seems to be to assert that a registry is bad and then to talk about the progress of science. I read your article, Edwin, and enjoyed it: how induction and deduction are combined in scientific progress. I even went and looked up a book I had as a graduate student (Validity and the Research Process by Brinberg and McGrath; some of the same material). But I see no evidence that (relatively) pro-registry people are anti-induction, especially as the thread has developed.
I am a proponent of short introductions, though.
I reviewed goal-setting theory for a book and I was astonished by the contradictions and the sometimes strong claims resting on weak evidence. Research on long-term effects is something I miss too. This exchange has been dominated by this example, but goal-setting theory is for me one example of researchers trying to prove their point (confirmation bias) and making strong claims which are not always substantiated by the data. I can understand that if you build your career on a few themes, confirmation bias is always lurking around the corner. Nuanced conclusions such as "setting stretching performance goals is not the best option for complex jobs" and "often 'do your best' goals work better" are often forgotten or not highlighted in the papers or reviews. But what I found most astonishing is the lack of research on the side effects. Goals definitely work in the short term for low-paid (bottom 25%), somewhat boring, physical output jobs - but for other jobs, and especially complex jobs, this has not been proven at all. Long-term effects have not been studied (not longer than one year to my knowledge), and the effects on stress and absenteeism have never been researched. People also sometimes increase their efforts while at the same time becoming less and less autonomously motivated. The research often mixes up "motivation" with "higher effort". This lack of research on side effects is worrisome: it is as if you develop a drug but do research only on the positive effects on the illness you want to treat, while totally ignoring (and not even trying to look at) side effects. As goals the way they are defined now in I/O psychology are definitely not a phenomenon that has been used like this by our ancestors (and have not been subject to natural selection), it is worthwhile investigating whether such "modern inventions" have serious negative side effects on autonomous motivation, health, etc.
I fully support preregistering experiments because I hope this can make psychology (and other sciences) more of a real science and deal with the file-drawer problem, which is a plague for all scientific domains. But again, the problem is bigger if researchers continue to dedicate themselves to one limited domain (in Belgium we have a saying that if you only have a hammer, you consider everything to be a nail): I think of psychoanalysis, goal setting, feedback intervention theory, and Self-Determination Theory, where I typically see many researchers devoting their entire careers to such a limited field of research (or two or three). And one only has to look at the sheer number of journals and themes such as psychoanalytical thinking and therapy, the dominant thinking in many fields of social psychology based on the Standard Social Science Model (blank slate, the brain as a general learning machine, etc.), or Jungian typologies to understand that psychology is especially plagued with crap theories - and this is a very "unique" position among the sciences. See also Lilienfeld's paper on the subject ("Can psychology become a science?").
The soundness of the underlying theory (meta-theory is often lacking!) and the convergence with other scientific domains such as behavioral biology, biology, evolutionary biology and psychology should, however, be the main concern. Preregistering experiments will not stop pseudoscientific theories from flourishing, I am afraid. But it is a necessary step, and replication is still very much needed too.
I don't see why exploratory research would be hindered by this. But if you want to build good evidence, exploratory research is only a start, and drawing conclusions at that stage is to be avoided at the least. For that we need the context of justification.
(1) There is nothing in what I have said that suggests we should rely on the effects of a single study. Nothing. On the other hand, multiple studies by a single lab are apparently no guarantee, as has been found recently in the multi-lab study, where multiple labs could not exactly replicate some of the priming effects examined. So, of course, multiple labs replicating findings is a good thing...that's the purpose of the multi-lab study just completed...which, by the way, involved pre-registration of the hypotheses and ensuring that the methods used were exactly as had been previously used (and with the cooperation of the original researchers). Without those pre-registration safeguards it would be too easy for the original researchers to say, oh, wait, you missed x, y, and z that we did differently, and maybe that contributed to the effect. So, the goal is to minimize this last possibility (although of course after the fact the researchers could still say that and further test for moderators contributing to the differential effects). And, if researchers are running tons of studies with tons of measures (as is increasingly possible), it becomes easier to just report the studies that jibe with one pattern and not report the rest -- it is clear that that would really jack up alpha and make it more likely that effects would not replicate in a multi-lab study. This is a potentially big problem, and pre-registration does help address it. We need to specify whether studies really are a priori (e.g., through a registry) or not. Chris is right that there is no GUARANTEE of exact replication (subsequently, by multiple labs) when the original study was pre-registered vs. not, but I bet the probability of exact replication is greater with the former than the latter (an empirical question).
(2) I agree with Matthew that I don't see how pre-registration would inhibit the development of science. Those taking this position should give some examples of studies that you think would not have been published if we had had pre-registration.
(3) There is nothing in what anyone has said to suggest that studies that were not a priori would not be publishable. The author just needs to be transparent about whether the study was a priori or not (that's simply transparency). I think the researcher has to make the case for non-a priori studies and convince us of their likely replicability and value (what they add to the field/conversation, etc.). For example, I spent more than two years analyzing a large data set because I was exploring some interesting possibilities in the data, but I had lots of other measures (so at one point I found a very interesting pattern, but if it was a reasonable conclusion then I knew I'd expect certain other patterns too with other measures within that data set). When I didn't find those patterns, I didn't publish that interesting finding, because I didn't trust it was real. Where interesting ideas were consistent through other testing - and with other work - I did think they were interesting and should be published, and I wrote them up, but treated them as questions. At least so far, I'm grateful that, to my knowledge, all my prior work has replicated where it has been tested in other labs, but I think part of that is that I am generally pretty hard on my findings in terms of putting them to consistency tests (and sending students back to recheck our procedures multiple times to make sure they warrant our claims; I've historically been especially careful when the findings were "too good" and likely due to error). I have assumed other researchers follow a similar self-standard in their work - but honestly I have no idea.
Duhem-Quine, for anyone who didn't know (or google it, ahem).
http://en.wikipedia.org/wiki/Duhem%E2%80%93Quine_thesis
And a Youtube video, worth it if only because surely we have all considered demonic intercession with our experiments.
http://www.youtube.com/watch?v=-klqI4d_wbY
Mr. Vermeren obviously did not really study goal-setting theory, because what he says is not true. The two best sources are two books: Locke & Latham 1990 and 2013. A short answer to Lynn: yes, multiple labs are a good idea, but exact replication has very limited value--it is still only one study. What is the generalizability? What are the causal mechanisms? What are the moderators? If a study can only survive by doing every single thing exactly the same, it will probably not lead to anything useful. Predicting the result does not make it any more useful than not predicting it. Either way you need replication with variation. And you need to understand how it works. Suppose you have 100 IV's and 10 work (a bit above chance). Whether you predicted this or not, you need to know if this is a reliable finding whose causes are understood.
Matt and the Duhem hypothesis: in the fundamental sense I agree with this, only more so, but it was not the point of my other posts. In reality, though, every experiment presupposes a huge context of prior knowledge, including knowledge not directly stated in the study. Take one example: every study in science assumes there is a real world out there and that you can know it. This is only the tip of the iceberg--knowledge is hierarchical and contextual. See Harriman's book The Logical Leap. This is why progress in science requires a sound epistemological foundation--a very big topic outside the scope of this post.
Chris Crandall referenced the D-Q hypothesis because he thinks your ideas overlap with that hypothesis. I just looked it up. Edwin, I think you should consider what Lynn wrote carefully. It strikes me that, while you are saying that replication without variation isn't very helpful, she is pointing out that replication putatively *without* variation isn't common enough. That is, replication without variation other than a different lab conducting the work. I think this has happened with some of John Bargh's ingenious priming studies.
I don't think anyone seriously disagrees with your point. In fact, I used the need to study variation around a phenomenon as an example of how multiple studies drawn from a set of hypotheses (questions would be as reasonable a term) would be a useful application of a registry - it's one of the earlier posts. People working independently from the same basic propositions can tell the field something about how, when and where x happens. But Lynn has a solid point: if you can't replicate without variation, there's an interpretive drawback to replication with variation.
D-Q and Popper-style approaches both seem to deal with uncertainty, but from different directions. One tells you nothing is assuredly false and the other that nothing is assuredly true. It was unfair contrasting Newton with Descartes, though; Rene didn't put his deductions to the test! (Popper would surely not have approved.)
This discussion is beginning to go stale. What is the scientific essence of this debate? Could it be that certain researchers who are not able to get up, collect data, write up, publish, and get credit for their work are trying to get on the scientific ladder by simply registering ideas that may never get off the ground? Science has been here as long as man; how is it that only today the need to register research ideas creeps up, when the idea's owner has not even collected data to see if the cap fits? "Publish or perish," goes the jargon.
Edwin--I'm sure the idea isn't to do one study because that's all they plan to do... They looked at 13 effects (some of them across multiple studies, such as those by Danny Kahneman). The idea is to do a SAMPLING across studies to see the extent of the exact replication problem. They started out with the easier studies to re-run, I suspect. But surely that's just the beginning. I am certain that if it turned out that a sampling of studies in our top journals (let's say 30% to 50%) could not be replicated in these early multi-lab efforts, there would be much more in the way of resources devoted to such replication efforts...and problems with exact replication would indeed be a huge problem from the perspective of the broader scientific community. Yes, it would be great to have a huge matrix of exact replications across many studies (including all the specified conceptual replications) by many labs...but let's give the researchers trying to SEE FIRST whether there is an exact replication problem a chance to get going.
Regarding Edwin's statement: "And you need to understand how it works. Suppose you have 100 IV's and 10 work (a bit above chance). Whether you predicted this or not, you need to know if this is a reliable finding whose causes are understood." I assume you mean 10 were significant after alpha correction for all those tests. If you are transparent with the reader about it, and then try to replicate those predictions in subsequent work, I don't have any problems with that. That's a bit like cross-validation (looking for patterns in 1/2 of a large data set and then seeing if they replicate in the other 1/2). At that point your predictions are a priori, right, so why not list them as such? No one, as Matthew suggests, disagrees with your point that we need multiple replications of work or that we need to understand the boundary conditions (and moderators involved) as well as the mediators and causal mechanisms. Of course. It's pretty much basic graduate school training for most of us.
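A small sketch of that split-half idea, with made-up data and invented variable names: hunt freely in one half, then let only the held-out half count as the confirmatory (now a priori) test. The last line also puts a rough number on Edwin's "100 IVs and 10 work" example under pure chance.

```python
# Split-half "exploration then confirmation" on purely null data; everything
# here is hypothetical and only meant to illustrate the workflow.
import numpy as np
from scipy.stats import pearsonr, binom

rng = np.random.default_rng(2)
n, n_predictors = 400, 100
X = rng.normal(size=(n, n_predictors))      # 100 candidate IVs, all noise
y = rng.normal(size=n)                      # outcome unrelated to any of them

explore = np.arange(n) < n // 2             # first half: exploratory
confirm = ~explore                          # second half: confirmatory

def significant(half, j, alpha=0.05):
    r, p = pearsonr(X[half, j], y[half])
    return p < alpha

candidates = [j for j in range(n_predictors) if significant(explore, j)]
survivors = [j for j in candidates if significant(confirm, j)]
print(f"flagged in exploration: {len(candidates)}, confirmed in hold-out: {len(survivors)}")

# "10 of 100 IVs significant" is more than a bit above chance at alpha = .05:
print(f"P(>=10 of 100 significant by chance) ~ {binom.sf(9, 100, 0.05):.3f}")  # ~0.03
```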
Akpan-- You ask, "Could it be that certain researchers who are not able to get up, collect data, write up, publish and get credit for their work are trying to get on the scientific ladder by simply registering ideas that may never get off the ground?" No, I don't even see how that would work. I actually don't think registration should be public until one goes to publish the work (with links or additional information to support a priori hypotheses provided to reviewers). I like the idea of a priori registration for many reasons, one of which is that I often spend a lot of time thinking about my power and how I will analyze my findings before collecting data (I always have; it was the way I was trained). Yet I often end up doing a different analysis because a reviewer wants it (often because the reviewer suspects that only the analysis the author presents worked and that's why the writer reported it). If pre-registering could reduce that kind of skeptical review of the chosen analysis, that would be another plus. At least in theory, those who pre-register should be allowed to use one-tailed tests (which provide a benefit in terms of power). Basically, those who have a priori ideas should derive some benefit from truly having had them in advance of running the study.
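To put a rough number on the one-tailed power point, here is a quick Monte Carlo sketch (effect size and sample size are hypothetical) showing how much a directional test buys at the same alpha, when the prediction is genuinely made in advance.

    # Hypothetical simulation: power of one-tailed vs. two-tailed t-test at the same alpha.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    alpha, n_per_group, d, n_sims = 0.05, 30, 0.5, 5000
    hits_one = hits_two = 0

    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        treated = rng.normal(d, 1.0, n_per_group)      # effect runs in the predicted direction
        hits_one += stats.ttest_ind(treated, control, alternative="greater").pvalue < alpha
        hits_two += stats.ttest_ind(treated, control, alternative="two-sided").pvalue < alpha

    print(f"power, one-tailed:  {hits_one / n_sims:.2f}")
    print(f"power, two-tailed:  {hits_two / n_sims:.2f}")

The gain is modest but real, and of course the one-tailed test is only defensible if the direction was fixed before the data came in -- which is precisely what registration would document.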
Lynn: It is nice that students are taught about mediators and moderators but that is not how the journal system actually works--they want you to make deductions off the bat, then test them and that's it--what they do not support is systematic theory building which can't be done today--they want quickie theory building--and, of course, they also want something totally new all the time which also undermines theory building. Goal setting theory could never have been built today.
Matt: I understand your point but it still strikes me that a result that is so delicate that you need exact replication, by another lab, to get a result is of questionable value. How robust is such a result? I would think that the original investigator should start with replication with variation if they want to be taken seriously, especially if the result seems surprising. But it seems the goal in these cases is not theory building at all but just getting attention.
As to Descartes, my point there was that Descartes used the wrong approach to science: everything deduced from starting assumptions or premises. Yes, he could have tested his claims (though he would have claimed that was an invalid method), but his mistake, I think, was his starting point. Recall the tremendous opposition Newton faced over his experiments with light--people totally opposed his inductive approach to the subject.
It is not that deduction is unimportant--but it presupposes induction. Consider the famous Socrates syllogism--it depends on the fact that all men are mortal. How did the Greeks (among others) discover this? By observation. What did they have to observe? That people could be killed in battle; that people died at all ages from illness; that there was a process of aging with visible manifestations, etc.
BTW: Aristotle founded the field of biology using induction based on detailed observations, including dissections, of living organisms. He made some errors, and the Catholics turned some of his conclusions into religious dogma, but that was not his doing. The point is that his approach, like Newton's, was right. Galileo, BTW, gave Aristotle credit for this approach to science.
What would a journal editor today do if you wrote an article based on inductive observations and your introduction ended simply by posing questions? I got away with this once but only once in my career. The rest of the time inductive articles put reviewers in a fury.
BTW: I do not oppose articles with hypotheses as such--I just oppose the idea that they have to be deduced from an existing theory (which in most cases is made up).
BTW: this exchange has been interesting. We should set up a big symposium on it.
Edwin-- I don't know what journal system you are talking about (perhaps reviewers in different areas operate with different implicit rules? different journals?), but I think your beliefs about "how things work" might need checking -- at least to see how generalizable they might be.
Certainly that has not been my experience in my 35 years of publishing in (and reading) a range of the top journals in psychology, communication, and public health/behavioral medicine, including serving as an associate editor of a personality journal. For example, the latest piece I published (albeit in a medical journal) tested parts of a mediational model we've been developing since the early 1990s as part of our theoretical framework, Socially Optimized Learning in Virtual Environments (SOLVE), which stresses the importance of automatic/affective processes in decision-making in high-risk populations. That was a stated a priori hypothesis in my current NIMH grant, based on interesting earlier exploratory findings in an NIAID grant -- note that I didn't publish those earlier findings because, although provocative, I wanted to do the follow-up test to make sure they would replicate as an a priori hypothesis. As I casually look through JPSP recently, there are clearly plenty of articles with moderators. A quick look also showed quite a few with mediators. For example, on the mediator front, de Melo, Carnevale, Read, & Gratch (2013) -- a very interdisciplinary team -- conducted a series of 5 experiments, including three studies using multiple mediation analyses and a causal-chain design, that support an emerging model with implications for designing algorithms for agent behavior in human-agent interactions (at the interface of social psychology and computer science).
On another issue-- for some journals (JPSP, for example) I think you probably couldn't get away with just a short intro posing questions. But for other journals you could have a short introduction to a publishable study today with just questions -- especially in the development of a new field in which we have little existing knowledge to build on. For example, at one point my colleagues and I wondered what features were important to women in HIV prevention products and whether the assumptions being made about women's interest in those products were valid (this was actually formative research for us). We had little to no information about this, and it was important to our next research projects and theoretical steps. Anyway, it was a question. Here's the entire introduction below (very short, as you can see):
It has been argued that one reason underlying the rapid spread of HIV among women is that, with the possible exception of the female condom, no reliable HIV/sexually transmitted disease (STD) prevention method is available that women can use without their partner’s consent. This has led to a call within the health care community for the development of more “female-controlled” methods of HIV/STD prevention. But this assumes that female control is an important and personally desirable feature for sexually active women and, moreover, that women who are currently not using an HIV/STD prevention method would do so if female-controlled methods were available. The present research attempts to determine the level of interest in female-controlled methods of HIV/STD prevention. (This was published with colleagues at the CDC in the American Journal of Public Health -- a top journal in that area.)
Although this was published a while back I think you could publish work like this today if it involved a new topic area that was an important one in a given field.
I do think though that more descriptive work should be publishable than often is.
I have been publishing for 50+ years and things have changed in management and I/O psychology journals--they are now theory fanatics. I am glad you had some good experiences.
A favorite deductive theory model now is mediated-moderation or vice versa. This looks really sophisticated. However, I have never seen one replicated. Such studies are not used to build theories but to look really impressive. And I am sure that a replication would never be accepted.
Dismissing criticism as 'just not true' (I did analyse your paper "A 35-year odyssey" in depth) and claiming authority ('I have been publishing for 50+ years') won't help the debate move forward. It is strange that there is such fierce opposition to pre-registering experiments and to replication, when we are talking mostly about research in the context of justification.
What exactly, indeed, is the theory behind goal setting: if goals work, then why do they work? What can evolutionary psychology tell us about that? Evolutionary psychology predicts, for example, that (a) if resources are abundant (e.g., foods high in protein, such as big mammals), then people will not only share food but also use it to increase their status, and (b) if you provide triggers for (male) competition, then fierce in-group competition will manifest itself. This has been shown in the San tribes living in the Kalahari desert. Exactly the same is happening in organizations: tournament theory started from the idea that if you create big gaps in salary above a certain level (salary × 2 to 10), then the best will compete for these positions. This is indeed exactly what happened, and it has been amplified by setting individual goals. Such goals create internal competition, diminish collaboration, and have resulted in more fraud. It is no secret that people use money and position to increase their status (an old mechanism for increasing one's fitness). Goal-setting theory also did not predict that the 'scientific recommendations' would not be followed. Contrary to your suggestions, stretch goals are set unilaterally (top down) more than ever before, resulting in well-documented work-related stress (see, e.g., a large European study, De Backer et al., 2000) and the psychosomatic diseases that follow from it.
SDT researchers Vansteenkiste, Ryan and Deci accused Locke and Latham of drawing too many ideological conclusions based on the belief of individualistic cultures that capitalism leads to freedom. Yet research on American employees shows that they feel anything but free in their jobs and in their lives (Ehrenreich, 2001). Performance goals remind people they are under the control of a boss. “Even many high-level workers also feel controlled by their work situations and they pay significant costs in terms of their performance and well-being” (Vansteenkiste et al., 2001).
In 2009, four prominent professors from equally renowned business schools admitted that for many years, they -- along with many companies -- had blindly followed the advice of academics who praised the positive effects of working with goals (Ordóñez et al., 2009). Like the SDT researchers, the professors now warn of very negative effects associated with goal setting, such as:
• diminished employee attention to more important organisational issues due to an exaggerated focus on narrow, short-term goals;
• a rise in risk behaviour and unethical behaviour;
• inhibited learning;
• corrosion of the organisational culture; and
• reduced intrinsic motivation (!).
So I find the lack of a meta-theory in many fields - and in goal-setting theory too - very disturbing; that is why so many side effects (which could have been predicted) are not taken into account. I am sure you meant well by spending so much time on goal-setting research, and you came up with recommendations such as 'goals only work well when people feel committed to them', but the ideal circumstances you describe are far from reality: in reality humans are, just like other species, torn between (in-group) collaboration and internal (inter-sex) competition. Individual goals stimulate internal competition (which is detrimental to organizations), and, for sure, those in power use goals to control their 'subordinates'. These side effects would have been predicted by a sound meta-theory (which evolutionary psychology could have provided).
Preregistering experiments could also encourage the selection of more elegant experiments that could result in (temporary) confirmation or falsification - but you also need a sound meta-theory for this.
Maybe this is also a good read:
http://popsych.org/i-find-your-lack-of-theory-and-replications-disturbing/
Edwin - I think there is actually so much common ground on everyone's vision of the relevance of induction and deduction in theory-generating and testing that our nationally-televised symposium would end up arguing about whether Descartes is the right choice to use to illustrate the pitfalls of deduction. I do think there is a long-standing fear in social psychology that the grand work of the 1960s and 1970s (heavily inductive) has been replaced by smaller-scale incremental and largely deductive work - the interlude between the excitement of paradigm shifts, to borrow from Kuhn. (My precession of Mercury example might come from Kuhn, now that I think about it.)
As practical advice, if social psychology journals are sere in this respect, put a public health twist on your work and write to a public health journal (per Lynn's AJPH example). Those journals are generally receptive to induction, and, in fact, there is a side industry in chin-stroking over whether epidemiology is too focused on induction.
Matt: in I/O psychology and management there is no common ground--it is all hypotheses from deduction, and inductive studies are met with virtual outrage.
So I guess other fields differ.
I think Kuhn was a student of Popper--not my cup of tea.
Kuhn came to prominence a little after Popper, I think. Kuhn's famous book, The Structure of Scientific Revolutions, was first published in 1962. Kuhn was a PhD physicist-turned-philosopher in the US, however, so not likely all that closely connected with Popper. In fact, I think Kuhn was quite disparaging of social sciences in general. Having just read a summary of Popper, I'd say he was quite the revolutionary, but, like so many revolutionaries, prone to exaggerate the point.
The two are certainly conceptually connected in that Kuhn envisions "paradigm shift" through the accumulation of data that falsifies existing theory - this is primarily a hypothesis-driven deductive exercise. But new paradigms are generally induced from those same observations in practice, so I doubt Kuhn was hostile to induction. And there are plenty of other examples of induction in social science - maybe this is something *psychology* really needs to take into account.
So here's a proposition worth exploring: (a) do interdisciplinary groups create more knowledge than mono-disciplinary groups and (b) do they do so through greater use of induction? For extra credit, what does this say about the traditional organization of academic departments?
Nice question, Matthew. Why don't you pose it as a separate question?
This is my last reply to Patrick. There is no way I can summarize 1,000 studies in a short post--you have obviously not read our two books, which, among other things, identify the mediators of goals. We also identify the possible pitfalls of goals and how to avoid them, and scores of other results. As to Ordóñez, there was a 2+2 exchange between them and me and Latham in Academy of Management Perspectives in 2009. We show their accusations to be arbitrary. You have obviously not read that work either. The reason I am an expert is that I have been doing goal studies and studying the literature for over 45 years. Your attacks are based on ignorance--you are not a serious scholar in this realm. That is why I won't exchange any more posts with you.
Patrick Vermeren--there is no reasonable distinction between the "context of discovery" and the "context of justification." That's been discarded by philosophers of science--psychologists just haven't got the message yet; but they will. They will.
I want to repeat Matthew Hogben's remark, because I think it's deeply important:
"I do think there is a long-standing fear in social psychology that the grand work
of the 1960s and 1970s (heavily inductive) has been replaced by smaller-scale
incremental and largely deductive work."
And you may have added that social psychologists have, self-destructively, set up a world that rewards this smallness, which is exacerbated by the desire to require preregistration.
Scientific progress is not deductive. I wish more social psychologists understood this.
The distinction that the "reformers" make is between inductive and deductive science, and they strongly prefer the latter (hence preregistration). They then map these onto context of discovery (inductive) and context of justification (deductive), and privilege the "formal" logic of deduction. This distinction was proposed by Hans Reichenbach in 1936, and disposed of in 1962 by Thomas Kuhn.
Here's a piece from the Stanford Encyclopedia of Philosophy (a first-rate source): http://plato.stanford.edu/entries/thomas-kuhn/
"Kuhn's claim that scientists do not employ rules in reaching their decisions appeared tantamount to the claim that science is irrational. This was highlighted by his rejection of the distinction between discovery and justification (denying that we can distinguish between the psychological process of thinking up an idea and the logical process of justifying its claim to truth)."
Hey Lynn: I'm not sure what you're asking about, philosophy of science, social psychology, Duhem-Quine, etc.
Just as a general aside, the Duhem-Quine hypothesis is that falsification is logically impossible. Quine made the argument that empirical tests are always "in context," that theories are complex and can be ever-so-slightly adjusted to withstand falsifying tests, and that the "killer test" cannot really be constructed.
Duhem, a French physicist (d. 1916), made the observation that every theoretical test involves the hypothesis AND its operationalization. And so, logically, we test the *conjunction* of a hypothesis and its operationalization, and when the test "fails" we cannot logically determine whether it was the hypothesis that was wrong or whether the operationalization was ineffective. Duhem is a quiet killer of bare-bones Popperian falsification. Dead as a doornail.
There is a proper response to this, and it's largely contained within Lakatos. To wit, I don't think any scientist should be making prescriptions about good scientific behavior without a working knowledge of the Lakatos-Feyerabend debates.
One starting point: http://www.amazon.com/For-Against-Method-Lakatos-Feyerabend-Correspondence/dp/0226467759
All scientists should know that this writing is accessible, and for me at least, brutally funny. For more fun, there is always "Against Method" by Feyerabend, which concludes, correctly I think, that the Catholic Church had a pretty good point about Galileo's claims.
Another good source: http://www.theguardian.com/commentisfree/belief/2012/oct/01/karl-popper-lakatos-kuhn-feyerabend
A quote, "A central difficulty of falsification is behavioural rather than theoretical – falsificationism is an ideal. Scientists do not, in practice, jettison theories in response to a single falsificatory instance."
Hi Chris: I was asking for the source for your comment, "The distinction that the "reformers" make is between inductive and deductive science, and they strongly prefer the latter (hence preregistration). They then map these onto context of discovery (inductive) and context of justification (deductive), and privilege the "formal" logic of deduction. "
Lynn: That's just my observation. Pre-registration's most common justification is to separate out "predicted" findings as compared to "discovered" findings.
In the discovery/justification distinction, this is the difference between inductive and deductive reasoning. Predicted findings come from hypotheses derived from theories (deductive).
Most of the hay being made is about ensuring stronger inference, about punishing people who present "findings" as if they were "expectations." Preregistration has no value for Type II error; it is only designed to reduce Type I error.
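A toy simulation makes the Type I point concrete (made-up numbers, and the "flexible analyses" are simplified to independent looks at null data; real analytic flexibility is correlated, so the inflation would be somewhat smaller, but still above the nominal rate):

    # Hypothetical illustration: reporting the best of several analyses of null data
    # inflates the false positive rate well above the nominal 5%.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    alpha, n, k_analyses, n_sims = 0.05, 50, 5, 5000
    false_positives = 0

    for _ in range(n_sims):
        ps = [stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
              for _ in range(k_analyses)]     # k looks, no true effect anywhere
        false_positives += min(ps) < alpha    # report whichever look came out "significant"

    print(f"nominal alpha: {alpha:.2f}, realized rate: {false_positives / n_sims:.2f}")

A single preregistered analysis of the same null data would sit at the nominal 5%, which is the sense in which preregistration targets Type I rather than Type II error.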
Michael, I don't know that the purpose of a registry of hypotheses - and I think I would now prefer a broader term like propositions - was ever intended explicitly to cut down on Type I error at the expense of Type II. I think, however, Chris Crandall has made a solid logical case that this is a danger, albeit specifically to social psychology. That is, the premise of privileged deductive research is not really replicated (ooh, see what I did there) in other disciplines. The debate migrated to the relative roles of inductive and deductive reasoning in science. I do not think there is a great deal of variance in opinion about the underlying relevance of both in an integrated scheme. In fact, I don't really think there are overblown examples in this chain of comments - maybe a couple of instances where one of us has set up a bit of a straw man to contrast with our own reasoning excellence.
But to return to social and I/O (Edwin Locke's field) psychology. There is a danger of intellectual atrophy in these disciplines if induction is under-valued. That is, the generative value of induction is reduced in scope, and therefore the value of subsequent deductive work is also reduced. Hypotheses draw from a diminishing field of ideas - reduced validity from inadequate domain-sampling, writ large. I am not connected enough with the field to comment knowledgeably about the present state of affairs, but I do recall I had to send my inductively-based and non-experimental PhD work to an interdisciplinary journal.
There are examples of this in sexual health and STD prevention work (mine); less a disciplinary failing and more a legacy issue around public health "traditional" practice - or traditional public opinion, for that matter. Public health and medical research does tend to "privilege" the RCT, but I estimate those trials are based on an adequate intake of inductive work. And there seems to be more room for editorials to accompany research in many areas of public health publishing; these are avenues for generalizing from the specific to the general - at least, when I write them. That's where the fun is.
Contemporary philosophers don't consider the views of Kuhn, Popper, Duhem, or the like to reflect actual thinking in philosophy. A very interesting book on the actual status is 'The Philosophy of Pseudoscience' (editors Boudry and Pigliucci).
Reading it will show that falsification has not been abandoned at all; it is still one of the criteria for judging whether the evidence presented is the result of good, bad, or pseudoscience. The book nicely shows that there is a limit to adapting one's theory each time a hypothesis is falsified. There is nothing against refining the theory (most theories are not abandoned after one falsification) - but at some point you need to stop and admit there has been too much falsification, and thus it is more reasonable to abandon the theoretical framework. But confirmation bias is lurking around the corner, and some scientists are very creative in finding a host of auxiliary hypotheses to protect their (lifetime) research against falsification (see page 87).
Patrick: Contemporary philosophers have rejected Quine? Or David Hull? Or Peter Railton? Or Philip Kitcher? These are "the like" of which you speak, and I think that you're simply wrong that contemporary philosophers embrace falsificationism.
Indeed, your own post describes the problem with pure falsificationism--scientists adapt, change, refine, and doubt. One needs a more complex understanding. Don't know the book you're referring to.
But here is the unmistakeable conclusion from 150 years of philosophy of science: The process is inductive, not deductive. And that is the lesson that the reformers have not learned. Yet.
Chris: I disagree. The book I am referring to just came out this November. Pigliucci, Dennett, Cioffi, and, dare I say, promising new philosophers like Maarten Boudry are very active on the demarcation of science, bad science, pseudoscience, etc. They write that rejecting falsification is a bridge too far - e.g., Boudry writes that there are still some restrictions 'on the amount of gerrymandering that can be allowed in the face of apparent refutations' (quoting Leplin, 1975). Of course, ruthless rejection has been abandoned ever since Duhem (1954) pointed out that hypotheses and theories stand within their context.
Or take the letter written by 72 Nobel laureates who provided an amicus curiae brief in the case against creationism (disguised as 'intelligent design'): "Science is devoted to formulating and testing naturalistic explanations for natural phenomena. It is a process for systematically collecting and recording data about the physical world, then categorizing and studying the collected data in an effort to infer the principles of nature that best explain the observed phenomena." They describe the following steps:
Collecting data and facts (discovery if you ask me)
Rigorous, methodological testing of principles that might present a naturalistic explanation
Form testable hypotheses based on these facts
Develop a theory to interpret these facts
Tentative conclusions (forever subject to reexamination)
Naturalistic Explanations.
Pigliucci categorizes most findings in psychology as having a lot of empirical data but often missing the theory. I concur. What do we know if we only observe that when you throw a ball up in the air, it falls down? Then hypotheses were launched and tested, and gravity is the theory... The Nobel Prize laureates wrote that facts cannot be interpreted without a theory.
You are misrepresenting the status ("unmistakable conclusion"): philosophers are still very much debating both issues (e.g., Okasha, 2011; Pigliucci, 2013).
In the same new book I referred to, Cleland and Brindell write that hypotheses can come from observations, dreams, lucky straws etc. "But however the hypothesis arises, what is important is that the scientist be able to deduce a test implication (in essence, a conditional prediction), which forms the basis for a search for confirming or disconfirming empirical evidence, whether in the lab or in the field." (p. 186) They further describe the problem of induction as follows: "one cannot justify a claim about unexamined instances by inference from examined ones, and one cannot infer universal laws of nature from the study of a limited number of instances."
I have no problem with the phase of discovery, on the contrary, but don't present it as producing reliable evidence. We then need the context of justification, and for that context I see no reason to oppose preregistering experiments and hypotheses. Just as the deductivist should not fear novel ideas, the inductivist should not fear justification or falsification.
Darwin was inductive in thinking about evolutionary forces, but he then painstakingly collected data to test and retest the hypotheses that led to his theory.
Don't have hypotheses in the first place--just show that you found something new.
OK, there is the hypothesis that justifies the study (and the grant money to run it), and the hypothesis that helps introduce the theoretical relevance of the study after it is completed. First, does it really matter if the two do not match? Second, the first one is available in the grant proposal and as speculation in older papers. The latter is in the introduction of recent papers.
Let's not add bureaucracy with preregistration...