Experimental is another word to describe (prospective) randomized controlled trials. The main ingredients of an experimental design are randomization and, by extension, a control group (or groups) with exactly the same probability of receiving the intervention as of receiving the control condition.
Quasi-experiments are also called non-randomized studies, observational studies, etc. Here, the main ingredients are that (a) the study is almost always performed retrospectively, and (b) you can adjust the data to "mimic" a randomized trial (using observed data only). The most popular approach is matching, where a control group is found among the non-treated population with the same observed baseline characteristics as the treated group. The groups are therefore comparable, and thus outcomes may be "assumed" unbiased (we assume unbiasedness because we can never control for unmeasured variables, which may confound the relationship between the treatment and outcomes)...
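To make the matching idea concrete, here is a minimal, purely illustrative sketch of one common way to implement it (1:1 nearest-neighbour matching on an estimated propensity score); the DataFrame and column names (`treated`, plus whatever baseline covariates you pass in) are hypothetical, and a real application would add caliper checks, balance diagnostics, etc.:

```python
# Minimal propensity-score matching sketch (illustrative only; data and column
# names are hypothetical, not from any specific study).
import pandas as pd
from sklearn.linear_model import LogisticRegression

def match_controls(df, covariates, treat_col="treated"):
    """Pair each treated unit with its nearest untreated unit on the propensity score."""
    # 1. Estimate each unit's probability of treatment from observed baseline covariates.
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df[treat_col])
    df = df.assign(pscore=model.predict_proba(df[covariates])[:, 1])

    treated = df[df[treat_col] == 1]
    controls = df[df[treat_col] == 0].copy()

    # 2. Greedy 1:1 nearest-neighbour matching without replacement.
    matched_ids = []
    for _, row in treated.iterrows():
        j = (controls["pscore"] - row["pscore"]).abs().idxmin()
        matched_ids.append(j)
        controls = controls.drop(index=j)

    # Matched sample: treated units plus the untreated units closest on observed covariates.
    return pd.concat([treated, df.loc[matched_ids]])
```

Whether the matched groups really behave like randomized arms still rests on the untestable "no unmeasured confounders" assumption noted above.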
Ahmad, what's the research context in which you work? In psychology and maybe most of social science, "experimental" implies a randomized experiment. Sometimes those have true control groups, but sometimes they don't (just comparison groups). I agree with almost everything Ariel said (Go Blue!), but would add that I don't know that quasi-experimental has a clear and universal definition. For example, I always thought that if you diverged too much from a true experiment (control group, individual randomization) you were in quasi-experimental land. For example, if you randomize classrooms instead of students (but your unit of analysis is the student) you still have an experiment, but it's less than pure due to the clustering within classrooms. I always thought this would be quasi-experimental. I definitely wouldn't include surveys or other "observational" studies without any experiment in them as even quasi-experimental. Remember that there's another category in the universe of all studies, and that's "non-experimental."
The categories you forwarded are in a lot of intro textbooks. I think they come from the Campbell and Stanley (196x) book that uses them in the title (I don't have it handy, and am on mobile). If anyone knows the roots of this dichotomy, I'm interested to learn. I've never found the quasi-experimental category very useful in practice. Also, I find it more helpful to think of the features of a given study than of what "type" it is, as many of those type descriptions are limited. So I'd ask "does the study have any kind of randomized experiment in it," and if so, "what was the randomization?" Keep in mind that a study can be both a sample survey AND an experiment. I hope that helps.
An experiment is so called because it contains both an experimental group and a control group.
A true experimental design is one that involves manipulation of the independent variable and comparison of groups formed by random assignment.
Quasi-experimental designs involve manipulation of the independent variable and may include comparison of groups, but lack random assignment of subjects to conditions.
The similarity between a true experiment and a quasi-experiment is that both of them contain an experimental group and a control group.
The difference is that a true experiment has probability samples and a quasi-experiment involves a non-probability sample.
The word experiment is a broader term because it can include a pre-experiment (which is not actually an experiment), a quasi-experiment, and a true experiment.
Ahmad and Eddie, I just wanted to add some clarification because I've found the word "randomization" is often confused by people learning this terminology. There are (at least) two ways "randomization" applies to study design. One is in how we select the sample (random or non-random). Interestingly, when we do traditional data analysis, we often assume the sample was a random one even if it wasn't because it makes the analysis easier. In cases where we actually know that it was random and have information about the probabilities of selection (i.e., a "probability sample"), then we include that information in the estimation. That's often called "complex sample/survey data analysis." Random samples that are weighted and adjusted properly will get you accurate (or nearly accurate) estimates of population parameters and confidence intervals.
The other place randomization comes in is in the assignment of participants to conditions in experimental studies. Note that sampling and assignment are two different steps; they could both be random, both be non-random, or one could be random and the other non-random. The way science and research works, some fields (e.g., psychology) put a premium on random assignment and worry much less, if at all, about the sample being random. Other fields like political science and public health (or at least the areas of them I know best) are more concerned with the sample being random, and include little or no random assignment/experimentation. I'm sure that's an overstatement, but you get the idea. Randomized experiments conducted on random samples are rare, but they exist. For example, in my line of work I conduct experiments on survey methods. If we do those "in the field" on a live survey, then we have a randomized experiment nested within a random sample. But there are logistical and ethical reasons why these are rarely done for "substantive topics" (e.g., the effects of an educational program or drug). It would be neat if others could share some instances.
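To illustrate that these really are two separate steps, here is a tiny toy sketch (hypothetical frame and sample sizes) in which selection and assignment are each randomized independently; either step could be swapped for a non-random rule without touching the other:

```python
# Toy illustration: sampling and assignment are two distinct steps (both random here;
# the population frame and sizes are hypothetical).
import random

random.seed(42)
population = [f"person_{i}" for i in range(10_000)]   # hypothetical sampling frame

# Step 1: SELECTION -- draw a simple random sample from the frame.
sample = random.sample(population, k=200)

# Step 2: ASSIGNMENT -- randomly allocate the sampled units to conditions.
random.shuffle(sample)
treatment_group, control_group = sample[:100], sample[100:]

# Either step could instead be non-random (a convenience sample in step 1, or
# assignment by intact classroom in step 2) without changing the other step at all.
```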
I just want to make one correction to Eddie's post in case it hasn't become clear. Eddie said "The difference is that a true experiment has probability samples and a quasi-experiment involves a non-probability sample." I don't think that's a good use of the term "experiment" (or "probability sample") because it confuses random assignment with sampling. Most (I'd argue 99%) of your truly experimental research (i.e., control group, treatment group, random assignment) done in psychology and other science labs, or in contexts with real people (e.g., kids in schools), is done with non-probability samples (or "pseudo-probability samples", e.g., the schools are chosen based on connections between researchers and administrators, but the classrooms are chosen randomly). I hope I didn't misunderstand what you were saying, Eddie.
Thanks for the good discussion. Ahmad, does this help you with your research? What is that research, by the way? I hope you're not a student looking for text to copy into a paper ;)
I agree with the aforementioned comments, but respectfully disagree with the idea that you cannot apply inferential statistics to a quasi-experimental design. Education is a field in which you commonly cannot physically do a randomized controlled trial (RCT), and so you rely heavily on quasi-experimental designs. Imagine using two different classrooms for an experiment at an elementary school. So long as other components are equivalent, the groups are comparable, albeit with a grain of salt, knowing that it's a quasi-experimental design. Take the Head Start program (Cicirelli 1969). There are definitely threats to validity, but people also definitely publish inferences from quasi-experimental designs. If your library has it, I highly recommend Green JL, Camilli G, and Elmore P. Handbook of Complementary Methods in Educational Research. 2006. Routledge. Chapter 32 is on quasi-experimental design.
(By the way, if you're looking for the original reference to the term quasi-experimental see Campbell and Stanley 1963.)
Thank you, Ahmad, for bringing up such a point of discussion.
The similarity between experimental and quasi-experimental studies is that in both cases there is manipulation of the environment. Quasi-experimental studies lack one of the qualities that RCTs have: either randomization or a control group is missing in quasi-experimental study designs.
Great comment, Andrew. I didn't mean to take such a strong stance, and didn't want to get into the related topics of Bayesian inference, propensity score matching, etc. I certainly didn't mean to imply that surveys are error free either. If they were I'd be out of a job! Between random sample (which has sampling error in it by default) and analyzable sample, there's nonresponse (and potential nonresponse error). This is probably the largest thing we worry about (though sampling error is the most well-developed area of our field). If you ignore it, you may have bias in all sorts of statistics. If you adjust for it, you may induce error if you don't account for the right things (i.e., the nonresponse mechanism specific to the statistic of interest). I essentially agree with what you've posted.
This is why statistics is fun, right :) I'm always reminded of the George Box quote "All models are wrong, but some are useful." It seems to describe much of my work and the research I read or review.
The question I always try to ask when reviewing or designing studies is "what inference is trying to be made?" Is it an inference about an effect (in which case random assignment and a true control group are the gold standard)? Or is it an inference about a "descriptive" population parameter (like a mean, total, proportion, etc.), in which case the random sample is the gold standard? Obviously, these two things meet in the middle somewhere, and even so-called "analytic statistics" (e.g., regression coefficients) are prone to the same potential errors as univariates and bivariates if the sample is non-probability (or probability with severe nonresponse error). Given the two distinct statistical traditions (psychological/experimental and survey/sampling/population), it's no surprise that we see a major focus on models in one arm of social science and a major focus on samples in another, and the overlap is slim (it's there, but slim, at least in the circles I'm in, which admittedly involve "old fashioned" government statistics). Sharon Lohr comes to mind as someone who works in the middle ground.
There are a couple of task force reports that the folks following this thread might find interesting (both by the American Association for Public Opinion Research, AAPOR). There's some interesting discussion of inference of experimental effects from non-probability sampling.
I tried to scan responses before I throw in my 2 cents, and I have not seen mentioned one important aspect of the difference. Apologies if I missed something. So here goes: experiments (where one can manipulate a variable of interest and keep everything else the same) allow us to claim causality: any observed change in outcomes can be attributed to the variable we manipulated. Anything else limits us to claims of association, which can also be useful, but do not allow us to act as if causality has been established.
You can claim causality in non-randomized studies - that is why it's called causal inference. However, you can never be certain that you've controlled for all sources of confounding (unobserved or unmeasured). If we couldn't make the case that the intervention is causal, then the entire field of non-experimental research would disappear.
I beg to disagree - I think that finding associations is very valuable. Besides, when research results are communicated, people often lose sight of any caveats (such as that we cannot be sure of causality), which can be very misleading and even end up supporting decisions that have negative consequences or fail to attain the positive ones. The fact that people claim causality anyway, even if they shouldn't, is no argument that such a practice is correct ... ;-)
I love the lively discussion! May I offer that there is potential to accept causality from a quasi-experimental design, but that it really comes with a grain of salt and depends on the reader's judgment of whether other potential confounders have been controlled for. I did a quick lit search because I've always learned that you CAN draw causality from a QE design, but clearly there are others who have learned otherwise. Here's what I came up with. I hope they're helpful to folks.
If you read only one, read: https://depts.washington.edu/methods/readings/Shadish.pdf
Clin Infect Dis. (2004) 38 (11): 1586-1591.
West SG, Biesanz JC, Pitts SC. Causal Inference and Generalization in Field Settings: Experimental and Quasi-Experimental Designs. Ch 3 in Handbook of Research Methods in Social and Personality Psychology. 2000
Sandra, I am not sure exactly what point you're disagreeing with. If your argument is that you can't draw causal inferences from non-experimental studies, then I can give you a simple example to disprove your position:
Do you agree or disagree with this statement: "smoking causes lung cancer"?
If you agree, then you're basing your statement on the observational studies that inferred that cigarettes cause lung cancer, since there is absolutely no prospective experimental study that examined this issue (nor would any institutional review board/human subjects committee approve a study in which individuals would purposely be exposed to a harmful product).
If you disagree, then you are making the same argument used by the tobacco firms that have argued in court (unsuccessfully), that the studies are observational, and thus cannot prove a causal relationship between smoking tobacco and lung cancer.
If you don't like that example, then perhaps you'd like to discuss asbestos or myriad other exposures that have been deemed to cause disease, based on observational studies alone?
I would like to add that quasi-experiments, in the form of experiments of opportunity or natural experiments, can be used to tackle really big questions, such as "did colonialism hold back development?", that would never be possible with an intervention-based randomized trial.
Diamond J, Robinson JA (eds.). Natural Experiments of History. 2011.
Dunning T. Natural Experiments in the Social Sciences: A Design-Based Approach.
And one part of Snow's classic work on cholera was of this form: he exploited the opportunity created when the intake of one water company was moved upstream.
To my mind he produced very strong evidence that contaminated water was causing the infection, long before the cholera vibrio had been identified.
See Chapter 4 of https://www.researchgate.net/publication/236671039_Epidemiology_An_Introduction
Excellent addition Kelvyn! Snow's work is the epitome of causal inference from observational data.
If we discarded every study that made causal inferences from observational data, entire disciplines would disappear (think of economics, political science, epidemiology, sociology, etc. that rely nearly 100% on observational data).
A final note. To strengthen the ability to draw causal inferences from both experimental and non-experimental data, researchers are encouraged to design their studies to investigate causal mechanisms. I am including a link to a paper I recently wrote about this exact issue.
Article Using mediation analysis to identify causal mechanisms in di...
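For readers who haven't seen the mechanics, a bare-bones mediation sketch (a simple product-of-coefficients illustration on simulated data; this is a generic example, not necessarily the approach taken in the linked article, and all variable names are hypothetical) looks like this:

```python
# Bare-bones mediation sketch: indirect effect of x on y through mediator m,
# estimated as the product of coefficients a*b (simulated data; names hypothetical).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                       # exposure / treatment
m = 0.5 * x + rng.normal(size=n)             # mediator, partly caused by x
y = 0.4 * m + 0.2 * x + rng.normal(size=n)   # outcome, affected by both

a = sm.OLS(m, sm.add_constant(x)).fit().params[1]                          # x -> m
b = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit().params[2]    # m -> y, given x

print("estimated indirect (mediated) effect a*b:", a * b)
```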
Natural experiments are different - it's perfectly fine to see houses crumbling after an earthquake and infer causality. For the rest, it may perhaps be safe in many cases, but unfortunately people extend the practice to "unsafe" situations. In the case of tobacco, I would bet (but don't know) that the cancer-causing claim was bolstered with some lab experiments on animals (and no, I would not want to take it on either way).
Some of these arguments seem to claim that validity is "in the eye of the beholder," but the whole point of Campbell & Stanley's argument about threats to validity is to push the need to specify alternative explanations (history, selection, etc.) in order to challenge a claim.
Also, I would say that quasi-experimental design has come a long way since Campbell & Stanley, with quite a range of interesting alternatives now available:
Shadish, Cook, & Campbell (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.
Indeed, brilliant minds such as Donald Rubin, Paul Rosenbaum, Alberto Abadie, Guido Imbens, Gary King, James Robins, James Heckman (to name but a few) have devoted their careers to developing causal inferential techniques for non-randomized data.
In epidemiology we have true experiments (cardinal feature: random assignment to E and C) and observational studies (all non-randomized controlled designs). As has been mentioned, the term 'quasi-experiment' comes from psychology, where successfully manipulating an exposure and observing a predicted correlated outcome has always been an important part of causal inference, particularly in the absence of true randomization.

Why is it a stronger design when the investigator deliberately determines the 'dose'? One, it removes self-selection bias. Two, it minimizes errors due to self-reported exposure measurement (e.g. over-reporting your exercise) that can occur in pure observational studies. IMO, if you have a strong theory, a decent effect size, and a fairly simple causal model (i.e. not hugely multifactorial and beset by dozens of confounders, or with decent laboratory control over major nuisance factors), it does make sense to talk about a separate class of research designs in between true experiments and observational studies.

The problem is that they will never be as strong on average as true experiments, in which randomization distributes both known and unknown confounders by chance. So in most population health applications, I would consider a 'quasi-experiment' a form of cohort design in which the measurement of one of many exposures of potential interest may be more accurate than typical self-report, and in which self-selection on that one exposure is not a major source of bias (although not everyone assigned would comply, so there will still be some self-selection to exposure). But epi studies tend to use larger samples than social psychology, so in terms of critical appraisal, I would ask why anyone would go to the same basic trouble and expense of deliberately assigning all those exposures, and not go the extra step and use randomization to create the groups?
The answer to "can QEs support claims of causality" is "yes -- under certain assumptions". What assumptions? The assumptions that underly the model. In the case of a matched comparison group design, the main assumption is that the E and C groups don't differ in any unmatched characteristics that affect the outcome of interest. In a comparative interrupted time series model, the main assumption is that the only variables (measured or unmeasured) that affect the outcome on which the two groups differ are invariant over time, so that past differences in trends can be extrapolated into the future. Every QE design rests on some assumption, and as long as the real world satisfies that assumption, it can be treated as demonstrating causality. The problem is, these assumptions are often quite nonintuitive and can never be tested. (If we could test them, we wouldn't need to make the assumption!) So to that extent, the validity of QEs truly is "in the eye of the beholder" -- do you believe the underlying assumption or not? Whereas with a properly implemented randomized experiment, no assumptions are needed -- you only need statistics to deal with random sampling error.
And, yes, there are cases where you can't randomize and have to rely on QE estimates. E.g., you can't randomize people to start smoking (although you CAN randomize people to a smoking cessation program and see whether it reduces cancer, though you would have to wait a long time). In those cases, you have to do your best to figure out what assumption(s) you are making and decide, given the risks of being wrong, whether you believe them or not.
Incidentally, I would define "quasi-experiment" to mean an estimation procedure that attempts to mimic an experiment. So I would say that the defining characteristic of a QE is the comparison of multiple groups with different exposure to treatment (either a treatment and comparison group or multiple treatment groups).
Finally, if Ariel Linden is still following this exchange -- you may be interested to know that I use your 2010 article on interrupted time series in my Program Evaluation class; it's the best thing I've found on that topic.
I would like to thank all of you for this active participation in this discussion. I have learned many things from it.
For Matt Jans: my field is education, and I'm new to it.
I'm asking this question because it relates to educational research, and I wonder whether true experimental design can be applied to that kind of research.
By the way, does quasi-experimental research relate to Auguste Comte's theory of social physics, or sociology?
A key study in the smoking-cancer debate was the experiment of opportunity afforded by the Doll and Hill study of general practitioners, as many of these primary care givers gave up smoking. This of course was not a randomized intervention, but the cancer rates fell with the number of years since they had given up.
Thus even a 60-year-old cigarette smoker could gain at least three years of life expectancy by stopping. These effects were clearly big and convincing.
Let me reframe. I think we're talking theory (or I am) vs. sound practice.
I believe I am right in theory - no causation outside experiments. However, in practice it is more than reasonable to make causal inferences in certain situations (such as when evidence accumulates, raising the likelihood of a causal link very high, even when the evidence has been gathered through sound quasi-experimental design). Then we are still not certain but quite confident, and we can make decisions based on such evidence with a high likelihood of success. In fact, it would be rather foolish to ignore such evidence.
The reason it is important to say this is that we have no assurance that the experimental work is sound; nevertheless, people claim causality based on much scanter and less reliable evidence than that which links smoking to cancer.
1. Suppose two randomized groups, Pill A and Pill B, differ (p < 0.01) on their average change in blood pressure. Most methodologists agree that this is the best available evidence (from a single study) that Pill A causes more BP change than Pill B on average, even though: i) there is still a 1% chance of Type I error, and ii) some people in the Pill A group will not have a blood pressure reduction, and some people in Pill B group will, meaning we often cannot 'prove' causation at the individual level even with a randomized design.
2. Suppose you hit your thumb hard with a hammer, and you experience sudden throbbing thumb pain. Even in this crummy little unblinded n of 1 study, is there any doubt in your mind that the hammer hit was the cause of your thumb pain? But wait! We're scientists - no controls??!! Fortunately, you've got your unhammered thumb as your control, and it is pain-free. But wait! You didn't randomize the thumb you hit and the thumb you didn't. So, after your thumb recovers, you do a proper experiment by flipping a coin (heads = left, tails = right) and this time you hit your randomly assigned thumb and experience pain, while your control thumb remains pain-free. Please explain i) exactly how the randomization in the proper randomized experiment yields a stronger causal inference than the non-randomized controlled observation, and then explain ii) how the presence of a pain-free control thumb in the non-randomized study increases your original causal certainty, based on just one thumb, that hitting your thumb with a hammer causes sudden intense thumb pain. Or to put it in pragmatic terms: at what point in this series of observations/experiments would you be sufficiently convinced of causation to make you try really hard to not hit your thumb again with a hammer?
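Returning to case 1 above, the randomized two-group comparison typically boils down to something as simple as the following sketch (simulated, hypothetical blood-pressure changes; a two-sample t-test is just one of several reasonable analyses):

```python
# Two-arm randomized comparison sketch: change in blood pressure under Pill A vs Pill B
# (simulated, hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
pill_a = rng.normal(loc=-12.0, scale=8.0, size=100)   # BP change under Pill A
pill_b = rng.normal(loc=-7.0, scale=8.0, size=100)    # BP change under Pill B

t_stat, p_value = stats.ttest_ind(pill_a, pill_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Even with randomization, a small p-value speaks to the *average* effect;
# individual responses can still go either way, as the post notes.
```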
Way funny, but we did cover the first case, which is like the smoking story. The second is ... hmm, I don't know like what, other than walking off a roof and finding it was not a good idea (not quite in the same category as what I thought we were discussing). We can often conclude with a high degree of confidence that something causes something else, but it remains a likelihood - which should not prevent us from using the information when soundly produced. But I was not thinking of clear-cut examples. I am thinking of a host of published articles that claim way much more than what is warranted. So I'd say that research quality and careful reporting are important (and not always what they should be).
Agreed, Sandra, glad to find a like-minded person. I'm on a mission to debunk the overly simplistic view that randomization (experiments) 'proves' causation and non-experiments don't. It's true that experiments are better, in theory, on average, but research design is but one of many, many factors. And in many cases, the causes we want most to prove (will Aunt Gladys live longer if she gets the surgery or takes the medicine?) we will never know with certainty.
In a true experiment, participants are randomly assigned to either the treatment or the control group, whereas they are not assigned randomly in a quasi-experiment.
In a quasi-experiment, the control and treatment groups differ not only in terms of the experimental treatment they receive, but also in other, often unknown or unknowable, ways. Thus, the researcher must try to statistically control for as many of these differences as possible.
Because control is lacking in quasi-experiments, there may be several "rival hypotheses" competing with the experimental manipulation as explanations for observed results.
In an experimental study, participants are randomly assigned to the treatment and control groups. Quasi-experimental research designs do not randomly assign participants to treatment or control groups for comparison.
I love these "demonstrations" that you don't need random assignment to prove causation -- though I like the oft-cited parachute example better than the hammer story. Both were covered in my original post in this thread. They are non-experimental evaluations where you believe the assumption on which the analysis is based. The hammer story is an interrupted time series. Your thumb has been feeling fine for days; then you hit it and pain begins immediately. Your assumption is that if you hadn't hit it, all the factors that caused it to feel fine would have continued to hold. As I said initially, non-experimental methods can prove causation if the assumptions on which they are based are correct.
In a true experiment, participants are randomly assigned to either the treatment or the control group, whereas they are not assigned randomly in a quasi-experiment. ... Quasi-experimental research designs do not randomly assign participants to treatment or control groups for comparison.
The definitions given in this thread are standard in the applied textbooks, but I think they grant randomized designs undeserved epistemic privilege, if we look at this from normative decision theory, which is how mathematical statistics is derived from first principles.
The risk of this undeserved epistemic privilege is that we overlook relevant evidence, simply because it does not come from an RCT.
Example: where would you place dynamic allocation/minimization controlled trials? According to the definitions above, they are "quasi-experimental" because they do not randomize. But they do, a priori, minimize differences between treated and control groups on known prognostic factors, and they are considered equivalent to RCTs by regulatory authorities.
Citations:
CONSORT Statement (2010): Explanation and Elaboration
http://www.consort-statement.org/
Quote: Nevertheless, in general, trials that use minimization are considered methodologically equivalent to randomized trials, even when a random element is not incorporated. [See Box 2].
Article Statistical issues in the use of dynamic allocation methods ...
Article Treatment allocation by minimisation
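To make the minimization idea concrete, here is a stripped-down, purely illustrative sketch of a deterministic Pocock-Simon-style rule (no random element, matching the quote above; the prognostic factors are hypothetical, and real implementations typically add factor weights and a random component):

```python
# Stripped-down minimisation sketch (simplified, deterministic Pocock-Simon-style rule;
# prognostic factors and their levels are hypothetical, for illustration only).

def allocate(participant, enrolled, factors=("sex", "age_band", "smoker")):
    """Return the arm that keeps the groups most balanced on the prognostic factors."""
    def imbalance_if_assigned(arm):
        total = 0
        for f in factors:
            # Count already-enrolled participants in each arm who share this factor level.
            counts = {a: sum(1 for p in enrolled
                             if p["arm"] == a and p[f] == participant[f])
                      for a in ("treatment", "control")}
            counts[arm] += 1  # pretend the newcomer joins `arm`
            total += abs(counts["treatment"] - counts["control"])
        return total

    return min(("treatment", "control"), key=imbalance_if_assigned)

# Usage sketch:
enrolled = [{"arm": "treatment", "sex": "F", "age_band": "60+", "smoker": False}]
newcomer = {"sex": "F", "age_band": "60+", "smoker": True}
arm = allocate(newcomer, enrolled)
enrolled.append({**newcomer, "arm": arm})
```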
From a decision theory POV, whether to randomize or dynamically allocate depends on the purpose of the experiment and the size of your budget.
Article Understanding and misunderstanding randomized controlled trials
Quote: We argue that any special status for RCTs is unwarranted. Which method is most likely to yield a good causal inference depends on what we are trying to discover as well as on what is already known. When little prior knowledge is available, no method is likely to yield well-supported conclusions. This paper is not a criticism of RCTs in and of themselves, nor does it propose any hierarchy of evidence, nor attempt to identify good and bad studies. Instead, we will argue that, depending on what we want to discover, why we want to discover it, and what we already know, there will often be superior routes of investigation and, for a great many questions where RCTs can help, a great deal of other work—empirical, theoretical, and conceptual—needs to be done to make the results of an RCT serviceable.
A Theory of Experimenters
https://www.nber.org/papers/w23867
Quote: This paper proposes a decision-theoretic framework for experiment design. We model experimenters as ambiguity-averse decision-makers, who make trade-offs between subjective expected performance and robustness. This framework accounts for experimenters' preference for randomization, and clarifies the circumstances in which randomization is optimal: when the available sample size is large enough or robustness is an important concern.
Other scholars (along with myself) have discussed this issue on the Data Methods discussion boards. The relevant threads are here: