Is consistency trivial in randomized controlled trials?

Here are some more thoughts on Hernan and Taubman’s famous 2008 paper, from a chapter I am finalising for the epidemiology entry in a collection on the philosophy of medicine. I realise I have made a similar point in an earlier post on this blog, but I think I am getting closer to a crisp expression. The point concerns the claimed advantage of RCTs for ensuring consistency. Thoughts welcome!

Hernán and Taubman are surely right to warn against too-easy claims about “the effect of obesity on mortality”, when there are multiple ways to reduce obesity, each with different effects on mortality, and perhaps no ethically acceptable way to bring about a sudden change in body mass index from, say, 30 to 22 (Hernán and Taubman 2008, 22). To this extent, their insistence on assessing causal claims as contrasts between well-defined interventions is useful.

On the other hand, they imply some conclusions that are harder to accept. They suggest, for example, that observational studies are inherently more likely to suffer from this sort of difficulty, and that experimental studies (randomized controlled trials) will ensure that interventions are well-specified. They express their point using the technical term “consistency”:

consistency… can be thought of as the condition that the causal contrast involves two or more well-defined interventions. (Hernán and Taubman 2008, S10)

They go on:

…consistency is a trivial condition in randomized experiments. For example, consider a subject who was assigned to the intervention group … in your randomized trial. By definition, it is true that, had he been assigned to the intervention, his counterfactual outcome would have been equal to his observed outcome. But the condition is not so obvious in observational studies. (Hernán and Taubman 2008, S11)

This is a non-sequitur, however, unless we appeal to a background assumption that an intervention—something that an actual human investigator actually does—is necessarily well-defined. Without this assumption, there is nothing to underwrite the claim that “by definition”, if a subject actually assigned to the intervention had been assigned to the intervention, he would have had the outcome that he actually did have.
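
One way to put the worry in symbols is the following sketch in standard potential-outcomes notation. The symbols are introduced here for illustration and do not appear in the quoted passage: A_i is the arm subject i was actually assigned to, K_i is the version of the intervention actually received (for instance, the kind of exercise), and Y_i is the observed outcome.

```latex
% Consistency as usually stated: if subject i was assigned a, the observed
% outcome equals the potential outcome under a.
\[
  A_i = a \;\Longrightarrow\; Y_i^{a} = Y_i
\]
% If "a" lumps together several versions k, each version has its own
% potential outcome, and the condition holds only version by version:
\[
  A_i = a,\; K_i = k \;\Longrightarrow\; Y_i^{a,k} = Y_i
\]
```

The first statement is true “by definition” only if the potential outcome under a is a single quantity, which is exactly what is in doubt when a covers versions that may have different effects.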

Consider the intervention in their paper, one hour of strenuous exercise per day. “Strenuous exercise” is not a well-defined intervention. Weightlifting? Karate? Swimming? The assumption behind their paper seems to be that if an investigator “does” an intervention, it is necessarily well-defined; but on reflection this is obviously not true. An investigator needs to have some knowledge of which features of the intervention might affect the outcome (such as what kind of exercise one performs), and thus need to be controlled, and which do not (such as how far west of Beijing one lives). Even randomization to “exercise” will not protect against confounding arising from preference for a certain type of exercise (for example, if people with healthy hearts are predisposed both to choose running and to live longer), unless one knows to randomize the assignment of exercise type and not to leave it to the subjects’ choice.
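
The point about subjects choosing their own type of exercise can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions: the function name, the 80%/20% preference for running, and the 10-year survival advantage of a healthy heart are all hypothetical, and exercise type is deliberately given no effect on survival at all.

```python
import random

def running_minus_weights(n=20000, seed=0, randomize_type=False):
    """Mean survival difference (running - weights) within the exercise arm.

    All numbers are invented for illustration. Exercise type has NO effect
    on survival here; only heart health does.
    """
    rng = random.Random(seed)
    years = {"running": [], "weights": []}
    for _ in range(n):
        healthy_heart = rng.random() < 0.5
        if randomize_type:
            # Investigator assigns the type by coin flip.
            p_run = 0.5
        else:
            # Subject chooses: people with healthy hearts mostly pick running.
            p_run = 0.8 if healthy_heart else 0.2
        kind = "running" if rng.random() < p_run else "weights"
        # Survival depends on heart health plus noise, never on `kind`.
        survival = 70 + (10 if healthy_heart else 0) + rng.gauss(0, 5)
        years[kind].append(survival)
    mean = lambda xs: sum(xs) / len(xs)
    return mean(years["running"]) - mean(years["weights"])

chosen = running_minus_weights(randomize_type=False)
assigned = running_minus_weights(randomize_type=True)
print(f"subjects choose type:   {chosen:.2f} years")
print(f"type is randomized too: {assigned:.2f} years")
```

Under these assumptions the first difference should come out near 6 years and the second near zero, even though everyone in the simulation was "randomized to exercise" and the type of exercise does nothing.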

This is exactly the same kind of difficulty that Hernan and Taubman press against observational studies. So the contrast they wish to draw, between “trivial” consistency in randomized trials and a much more problematic situation in observational studies, is a mirage. Both can suffer from failure to define interventions.


Is the Methodological Axiom of the Potential Outcomes Approach Circular?

Hernan, VanderWeele, and others argue that causation (or a causal question) is well-defined when interventions are well-specified. I take this to be a sort of methodological axiom of the approach.

But what is a well-specified intervention?

Consider an example from Hernán & Taubman’s influential 2008 paper on obesity. In that paper, BMI is shown not to correspond to a well-specified intervention; better-specified interventions include one hour of strenuous physical exercise per day (among others).

But what kind of exercise? One hour of running? Powerlifting? Yoga? Boxing?

It might matter – it might turn out that, say, boxing and running for an hour a day reduce BMI by similar amounts but that one of them is associated with longer life. Or it might turn out not to matter. Either way, it would be a matter of empirical inquiry.
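
To see why it might matter, here is a small worked example with made-up numbers. The effect sizes, the mixes, and the helper `average_effect` are hypothetical, not estimates from any study. It compares two trials of “one hour of strenuous exercise” that differ only in which kind of exercise participants mostly do.

```python
def average_effect(mix, effect_by_type):
    """Average effect of the lumped intervention, given each version's share."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * effect_by_type[kind] for kind, share in mix.items())

# Hypothetical effects on life expectancy (years); BMI change assumed identical.
effect_by_type = {"running": 3.0, "boxing": -1.0}

trial_a = {"running": 0.9, "boxing": 0.1}  # mostly runners
trial_b = {"running": 0.1, "boxing": 0.9}  # mostly boxers

effect_a = average_effect(trial_a, effect_by_type)
effect_b = average_effect(trial_b, effect_by_type)
print(round(effect_a, 2), round(effect_b, 2))
```

With these invented numbers the “same” intervention comes out at about +2.6 years in one trial and about -0.6 years in the other. Whether real exercise types behave anything like this is, again, an empirical question.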

This has two consequences for the mantra that well-defined causal questions require well-specified interventions.

First, as I’ve pointed out before on this blog, it means that experimental studies don’t necessarily guarantee well-specified interventions. Just because you can do it doesn’t mean you know what you are doing. The differences you might think don’t matter might matter: different strains of broccoli might have totally different effects on mortality, etc.

Second, more fundamentally, it means that the whole approach is circular. You need a well-specified intervention for a good empirical inquiry into causes, and you need a good empirical inquiry into causes to know whether your intervention is well-specified.

To me this seems to be a potentially fatal consequence for the claim that well-defined causal questions require well-specified interventions. For if that were true, we would be trapped in a circle: we could never have any well-specified interventions, and thus no well-defined causal questions either. So either we really are trapped in that circle, or we can have well-defined causal questions, in which case it is false that these always require well-specified interventions.

This is a line of argument I’m developing at present, inspired in part by Vandenbroucke and Pearce’s critique of the “methodological revolution” at the recent WCE 2014 in Anchorage. I would welcome comments.

A Tale of Two Papers

I’m on my way back from the World Epi Congress in Anchorage, where causation and causal inference have been central topics of discussion. I wrote previously about a paper (Hernán and Taubman 2008) suggesting that obesity is not a cause of mortality. There is another, more recent paper published in July of this year, suggesting, more or less, that race is not a cause of health outcomes – or at least that it’s not a cause that can feature in causal models (VanderWeele and Robinson 2014). I can’t do justice to the paper here, of course, but I think this is a fair, if crude, summary of the strategy.

This paper is an interesting comparator for the 2008 obesity paper (Hernán and Taubman 2008). It shares the idea that there is a close link between (a) what can be humanly intervened on, (b) what counterfactuals we can entertain, and (c) what causes we can meaningfully talk about. This is a radical view about causation, much stronger than any position held by any contemporary philosopher of whom I’m aware. Philosophers who do think that agency or intervention is central to the concept of causation treat the interventions as in-principle ones, not as things humans could actually do.

Yet feasibility of manipulating a variable really does seem to be a driver in this literature. In the paper on race, the authors consider what variables form the subject of humanly possible interventions, and suggest that rather than ask about the effect of race, we should ask what effect is left over after these factors are modelled and controlled for, under the umbrella of socioeconomic status. That sounds to me a bit like saying that we should identify the effects of being female on job candidates’ success by seeing what’s left after controlling for skirt wearing, longer average hair length, shorter stature, higher pitched voice, female names, etc. In other words, it’s very strange indeed. Perhaps it could be useful in some circumstances, but it doesn’t really get us any further with the question of interest – how to quantify the health effects of race, sex, and so forth.

Clearly, there are many conceptual difficulties with this line of reasoning. A good commentary published alongside the paper (Glymour and Glymour 2014) really dismantles its logic. But I think there are a number of deeper and more pervasive misunderstandings to be cleared up, misunderstandings which help explain why papers like this are being written at all. One is confusion between causation and causal inference; another is confusion between causal inference and particular methods of causal inference; and a third is a mix-up between fitting your methodological tool to your problem, and your problem to your tool.

The last point is particularly striking. What’s so interesting about these two papers (2008 & 2014) is that they seem to be trying to fit research problems to methods, not to develop methods to solve problems – even though developing methods is ostensibly what they (at least VW&R 2014) are trying to do. To me, this is strongly reminiscent of Thomas Kuhn’s picture of science, according to which an “exemplary” bit of science occurs, and initiates a “paradigm”, which is a shared set of tools for solving “puzzles”. Kuhn was primarily influenced by physics, but this way of seeing things seems quite apt to explain what is otherwise, from the outside, quite a remarkable, even bizarre, about-turn. Age, sex, race – these are staple objects of epidemiological study as determinants of health; and they don’t fit easily into the potential outcomes paradigm. It’s fascinating to watch the subsequent negotiation. But I’m quite glad that it doesn’t look like epidemiologists are going to stop talking about these things any time soon.


Glymour C and Glymour MR. 2014. ‘Race and Sex Are Causes.’ Epidemiology 25(4): 488–490.

Hernán M and Taubman S. 2008. ‘Does obesity shorten life? The importance of well-defined interventions to answer causal questions.’ International Journal of Obesity 32: S8–S14.

VanderWeele TJ and Robinson WR. 2014. ‘On the Causal Interpretation of Race in Regressions Adjusting for Confounding and Mediating Variables.’ Epidemiology 25(4): 473–484.