I’m in Anchorage, preparing for the World Congress of Epidemiology. One of the sessions I’m speaking at is a consultation for the next edition of the Dictionary of Epidemiology. It’s a strange and delightful document, this Dictionary, since it sets out to define not only individual words but also the discipline of epidemiology as a whole. Thus it contains both mundane and metaphysical entries, from “death certificate” to “causality”. I’m billed to talk about “Defining Measures of Causal Strength”. There’s a lot to say: the current entries for causation-related terms could use some disciplining. But I’m particularly interested in orienting myself with regard to the “potential outcomes” view of causation, which seems to be the current big thing among epidemiologists.
The potential outcomes view is associated in particular with Miguel Hernan, a very smart epidemiologist at Harvard, and he has a number of nice papers on it. (I hope I don’t need to say that what follows is not a personal attack: I have great respect for Hernan, and am stimulated by his work. I’m just taking his view as exemplary of the potential-outcomes approach, in the way that philosophers typically do.)
In particular I’ve been engaged in a close reading of a paper on obesity by Hernan and Taubman (2008). Their view, as expressed in that paper, is an interesting mix of pragmatism and idealism. On the one (pragmatic) hand, they argue that causal questions are often ill-formed, and thus unanswerable. There is no answer to the question “What is the effect of body-mass index (BMI) on all-cause mortality?” because the different ways to intervene on BMI may result in different effects on mortality. Diet, exercise, a combination of diet and exercise, smoking, chopping off a limb – these are all ways to reduce BMI. Until we have specified which intervention we have in mind, we cannot meaningfully quantify the contribution of BMI to mortality.
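The point can be put in standard potential-outcomes notation. This is only a sketch: the symbols and the index k are my own shorthand for illustration, not notation taken from Hernan and Taubman's paper.

```latex
% A = 1: BMI is lowered; A = 0: BMI is left alone; Y: death from any cause.
% k indexes the (hypothetical) route by which BMI is lowered.
\[
  \Delta_k \;=\; \mathbb{E}\!\left[\,Y^{a=1,\,k}\,\right] \;-\; \mathbb{E}\!\left[\,Y^{a=0}\,\right],
  \qquad
  k \in \{\text{diet},\ \text{exercise},\ \text{diet and exercise},\ \text{smoking},\ \text{amputation}\}
\]
% Nothing guarantees that \Delta_{\text{diet}} = \Delta_{\text{smoking}},
% so "the effect of BMI on mortality", with k left unspecified, has no single value.
```

On this way of writing it, the question "What is the effect of BMI?" asks for a single number where there is in fact a family of numbers, one for each k.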
This much is highly reminiscent of contrastivist theories of causation in philosophy. Contrastivist theories take causation to consist in counterfactual dependence, but differ from counterfactual theories in taking the form of causal statements to be implicitly contrastive: not “c causes e” but “c rather than C* causes e rather than E*”, where C* and E* are classes of events that could occur in the absence of c and e respectively. Against this background, Hernan and Taubman’s point is simply that, for an epidemiological investigator, it matters what contrast class we have in mind when we seek to estimate the size of an effect. This is a good point, especially in a context where one hopes to act on a causal finding. One had better be sure that one knows, not only that there is a causal connection between a given exposure and outcome, but also what will happen if a given intervention replaces the factor under investigation. I have called the failure to appreciate this point The Causal Fallacy and linked it to easy errors in prediction (see this previous post and Broadbent 2013, 82).
But there is another, more troubling side to the view as it is expressed in this paper: the suggestion that randomized controlled trials (RCTs) offer protection against this error, and somehow force us to specify our interventions precisely. The argument for this claim is striking, but on reflection I fear it is specious.
Hernan and Taubman observe that an observational study might appear able to answer the question “What is the effect of BMI on all-cause mortality?” via a statistical analysis of data on BMI and mortality, whereas randomized controlled trials could not answer this question directly. They could only answer questions like “What is the effect of reducing BMI via dietary interventions? / via exercise? / via both?” This apparent shortcoming of RCTs is, on their account, a strength in disguise: the observational study is in fact not so informative, since it does not distinguish the effects of different ways of reducing BMI, while the RCTs do give us this information.
This argument is fallacious, however, for the following reasons.
- An observational study that includes the same information as the RCTs on the methods of reducing BMI would also be able to distinguish between the effects of these interventions.
- It is true that one could conduct an observational study which ignored the possibility that different methods of reducing BMI might themselves affect mortality. But that would be a bad study, since it would ignore the effects of known confounders. A good study would take these things into account.
- Conversely, it is a mistake to suppose that RCTs offer protection against this sort of error. The BMI case is a special one, precisely because there are so many ways to intervene to reduce BMI and we know that these could affect mortality. In truth, there are many ways to make any intervention. One may take a pill or a capsule or a suppository, on the equator or in the tropics, before or after a meal, and so on. Even in an RCT, the intervention is not fully specified. Rather, we simply assume that the differences don’t matter, or that if they do, they are “cancelled out” by the randomization process.
- Randomized controlled trials are not controlled in the manner of true controlled experiments; rather, randomization is a surrogate for controlling. We hope that all the many differences between the circumstances of each intervention in the treatment group will either have no effect or, if they do, will have effects that are randomly distributed so as not to obscure the effect of the treatment. But in principle, it is still possible that this hope is not fulfilled. At a significance threshold of 0.05, chance differences of this kind will produce a spurious “significant” result in roughly one in 20 trials of a treatment that in fact does nothing; and the proportion may be higher among published RCTs, given publication bias (i.e. the fact that null results are harder to publish).
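To make the one-in-20 figure concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the treatment has no effect at all, each participant carries a hidden "frailty" that raises their risk, and the arm sizes, risks, and function names are invented. It simply counts how often randomization alone yields a difference that passes the 0.05 threshold.

```python
import math
import random


def simulate_arm(rng, n):
    """Deaths in one arm. Risk depends only on hidden frailty, never on treatment."""
    deaths = 0
    for _ in range(n):
        frail = rng.random() < 0.5          # hypothetical unmeasured difference
        risk = 0.5 if frail else 0.1        # invented risks, average 0.3
        deaths += rng.random() < risk
    return deaths


def two_sided_p(x1, x2, n):
    """Two-proportion z-test (pooled variance) for two arms of equal size n."""
    pooled = (x1 + x2) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 1.0
    z = (x1 / n - x2 / n) / se
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_trials=2000, n_per_arm=200, alpha=0.05, seed=1):
    """Share of null trials in which chance imbalance looks 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        treated = simulate_arm(rng, n_per_arm)
        control = simulate_arm(rng, n_per_arm)
        if two_sided_p(treated, control, n_per_arm) < alpha:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print(f"spurious 'significant' results: {false_positive_rate():.1%}")
```

Under these assumptions the printed share should hover around five per cent. Note what the sketch does not show: it says nothing about how often a significant result from a real trial is spurious, since that depends on how many of the treatments tested actually work.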
These are familiar points in the philosophical literature on randomized controlled trials (see esp. Worrall 2002). The point I wish to pull out is this. On the one hand, Hernan’s emphasis on getting a well-defined contrastive question is insightful and important. But on the other hand, it is wrong to think that RCTs solve the problem. True, in an RCT you must make an intervention. But it does not follow that one’s intervention is well-specified. There might be all sorts of features of the particular way that you intervene that could skew the results. And conversely, plug the corresponding “how it happened” information into a cohort study, and you will be able to obtain the same sorts of discrimination between these methods.
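The claim about cohort studies can be sketched with a toy simulation. All the numbers and names below are invented for illustration, and the sketch assumes away confounding entirely (the method of BMI reduction is assigned independently of everything else), so it speaks only to the narrow point that recording the method lets an observational analysis separate the effects.

```python
import random

# Hypothetical 10-year mortality risks by how BMI was (or was not) lowered.
TRUE_RISK = {"none": 0.10, "diet": 0.08, "exercise": 0.06, "smoking": 0.20}


def simulate_cohort(n, seed=0):
    """Return (method, died) records for an imaginary cohort of n people."""
    rng = random.Random(seed)
    methods = list(TRUE_RISK)
    cohort = []
    for _ in range(n):
        method = rng.choice(methods)            # assumption: no confounding
        died = rng.random() < TRUE_RISK[method]
        cohort.append((method, died))
    return cohort


def risk(records):
    """Observed proportion of deaths among the given records."""
    return sum(died for _, died in records) / len(records)


def risk_differences(cohort):
    """Risk difference versus 'none': pooled over methods, and per method."""
    baseline = risk([r for r in cohort if r[0] == "none"])
    pooled = risk([r for r in cohort if r[0] != "none"]) - baseline
    by_method = {
        m: risk([r for r in cohort if r[0] == m]) - baseline
        for m in TRUE_RISK
        if m != "none"
    }
    return pooled, by_method


if __name__ == "__main__":
    pooled, by_method = risk_differences(simulate_cohort(200_000))
    print(f"pooled 'effect of lowering BMI': {pooled:+.3f}")
    for method, diff in by_method.items():
        print(f"  via {method}: {diff:+.3f}")
```

The pooled figure is an average that describes none of the individual methods, whereas the per-method figures recover the differences built into the simulation. Whether real cohort data can do the same depends, of course, on the confounding that this sketch leaves out.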
On top of all this, the focus on the methods of individual studies obscures the most important point of all: that convincing evidence comes from a multitude of studies. Just as an RCT allows us to assume that differences between individuals are evenly distributed and thus ignorable, so a multitude of methodologically inferior studies can provide very strong evidence if their methodological shortcomings are different. This is the kind of situation Hill responded to with his guidelines (NOT criteria!) for inferring causality (Hill 1965). Similarly, ad hoc arguments against each possible alternative explanation can add up to a compelling case, as in the classic paper by Cornfield and colleagues on smoking and lung cancer (Cornfield et al 1959). The recent insights of the potential outcomes approach are valuable and important, but they augment rather than replace these familiar, older insights.
Broadbent, A. 2013. Philosophy of Epidemiology. Basingstoke and New York: Palgrave Macmillan.
Cornfield J, Haenszel W, Hammond EC, Lilienfeld AM, Shimkin MB and Wynder EL. 1959. Smoking and lung cancer: recent evidence and a discussion of some questions. Journal of the National Cancer Institute 22: 173-203.
Hernan, MA and Taubman, SL. 2008. Does obesity shorten life? The importance of well-defined interventions to answer causal questions. International Journal of Obesity 32: S8-S14.
Hill, AB. 1965. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine 58: 295-300.
Worrall, J. 2002. What Evidence in Evidence-Based Medicine? Philosophy of Science 69 (S3): S316-S330.