Causal Inference in Statistics fills that gap. Using simple examples and plain language, the book lays out how to define causal parameters; the assumptions necessary to estimate causal parameters in a variety of situations; how to express those assumptions mathematically; whether those assumptions have testable implications; how to predict the effects of interventions; and how to reason counterfactually. These are the foundational tools that any student of statistics needs to acquire in order to use statistical methods to answer causal questions of interest.
This book is accessible to anyone with an interest in interpreting data, from undergraduates, professors, and researchers to the interested layperson. Examples are drawn from a wide variety of fields, including medicine, public policy, and law; a brief introduction to probability and statistics is provided for the uninitiated; and each chapter comes with study questions to reinforce the reader’s understanding.
We have all heard the old saying “correlation is not causation”. This is a problem for statistics, since all it can measure is correlation. Pearl here argues that this is because statisticians are restricting themselves too much, and that it is possible to do more. There is no magic; to get this extra power, you have to add something into the system, but that something is very reasonable: a causal model.
He organises his argument using the three-runged “ladder of causation”. On the bottom rung is pure statistics, reasoning about observations: what is the probability of recovery, found from observing these people who have taken a drug? The second rung allows reasoning about interventions: what is the probability of recovery, if I were to give these other people the drug? And the top rung includes reasoning about counterfactuals: what would have happened if that person had not received the drug?
Intervention (rung 2) is different from observation alone (rung 1) because the observations may be (almost certainly are) of a biassed group: observing only those who took the drug for whatever reason, maybe because they were already sick in a particular hospital, or because they were rich enough to afford it, or some other confounding variable. The intervention, however, is a different case: people are specifically given the drug. The purely statistical way of moving up to rung 2 is to run a randomised controlled trial (RCT), to remove the effect of confounding variables, and thereby to make the observed results the same as the results from intervention. The RCT is often known as the “gold standard” for experimental research for this reason.
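To make the gap between the two rungs concrete, here is a minimal sketch in Python. It is my own toy illustration, not an example from the book: the variable names (Z for "wealthy", X for "took the drug", Y for "recovered") and every probability are invented. It computes the difference in recovery rates one would observe when wealth drives who takes the drug, and again when a coin flip decides, as in an RCT.

```python
# Hypothetical toy model: Z = 1 means "wealthy", X = 1 means "took the drug",
# Y = 1 means "recovered". All numbers are invented for illustration.
P_Z1 = 0.5  # P(Z = 1)


def p_recover(x, z):
    """P(Y = 1 | X = x, Z = z): the drug adds 0.2, wealth adds 0.4."""
    return 0.3 + 0.2 * x + 0.4 * z


def observed_difference(p_treat_given_z):
    """P(Y=1 | X=1) - P(Y=1 | X=0) as seen in the data, with no adjustment.

    p_treat_given_z maps z -> P(X = 1 | Z = z), i.e. who ends up on the drug.
    """
    def p_recover_given_x(x):
        num = 0.0  # accumulates P(Y = 1, X = x)
        den = 0.0  # accumulates P(X = x)
        for z in (0, 1):
            p_z = P_Z1 if z == 1 else 1 - P_Z1
            p_x = p_treat_given_z[z] if x == 1 else 1 - p_treat_given_z[z]
            num += p_z * p_x * p_recover(x, z)
            den += p_z * p_x
        return num / den

    return p_recover_given_x(1) - p_recover_given_x(0)


# Observation: the wealthy are far more likely to take the drug.
confounded = observed_difference({0: 0.2, 1: 0.8})
# RCT: a fair coin decides, so treatment no longer depends on wealth.
randomised = observed_difference({0: 0.5, 1: 0.5})

print(round(confounded, 3), round(randomised, 3))  # prints: 0.44 0.2
```

In this toy model the drug's real effect is 0.2 by construction. The observed data more than double it, because the treated group is also the wealthier group; randomising the assignment recovers the built-in value.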
But here’s the thing: what is a confounding variable, and what is not? In order to know what to control for, and what to ignore, the experimenter has to have some kind of implicit causal model in their head. It has to be implicit, because statisticians are not allowed to talk about causality! Yet it must exist to some degree, otherwise how do we even know which variables to measure, let alone control for? Pearl argues for making this causal model explicit, and using it in the experimental design. Then, with respect to this now explicit causal model, it is possible to reason about results more powerfully. (He does not address how to discover this model: that is a different part of the scientific process, of modelling the world. However, observations can be used to test the model to some degree: some models are simply too causally strong to support the observed situation.)
Pearl uses this framework to show how and why the RCT works. More importantly, he also shows that it is possible to reason about interventions sometimes from observations alone (hence data mining pure observations becomes more powerful), or sometimes with fewer controlled variables, without the need for a full RCT. This is extremely useful, since there are many cases where RCTs are unethical, impractical, or too expensive. RCTs are not the “gold standard” after all; they are basically a dumb sledgehammer approach. He also shows how to use the causal model to calculate which variables do need to be controlled for, and how controlling for certain variables is precisely the wrong thing to do.
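As a rough illustration of reasoning about an intervention from observations alone, the sketch below applies the adjustment formula, P(Y=1 | do(X=x)) = sum over z of P(Y=1 | X=x, Z=z) P(Z=z), to a purely observational joint distribution. The model is hypothetical and the numbers are invented; the key assumption, which in practice has to come from the causal model, is that Z is the only confounder of X and Y.

```python
from itertools import product


def build_joint():
    """Return {(x, y, z): probability} for a hypothetical confounded study.

    Z = 1: wealthy, X = 1: took the drug, Y = 1: recovered (invented numbers).
    """
    p_z = {0: 0.5, 1: 0.5}
    p_x1_given_z = {0: 0.2, 1: 0.8}  # the wealthy take the drug more often
    joint = {}
    for x, y, z in product((0, 1), repeat=3):
        p_x = p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]
        p_y1 = 0.3 + 0.2 * x + 0.4 * z  # true effect of the drug is 0.2
        p_y = p_y1 if y == 1 else 1 - p_y1
        joint[(x, y, z)] = p_z[z] * p_x * p_y
    return joint


def prob(joint, **fixed):
    """Marginal probability of the event described by fixed, e.g. x=1, z=0."""
    total = 0.0
    for (x, y, z), p in joint.items():
        values = {"x": x, "y": y, "z": z}
        if all(values[k] == v for k, v in fixed.items()):
            total += p
    return total


def p_recover_do(joint, x):
    """Adjustment formula: sum over z of P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(
        prob(joint, x=x, y=1, z=z) / prob(joint, x=x, z=z) * prob(joint, z=z)
        for z in (0, 1)
    )


joint = build_joint()

# Naive comparison of treated and untreated groups, ignoring Z.
naive = (prob(joint, x=1, y=1) / prob(joint, x=1)
         - prob(joint, x=0, y=1) / prob(joint, x=0))
# Same observational data, adjusted for the confounder Z.
adjusted = p_recover_do(joint, 1) - p_recover_do(joint, 0)

print(round(naive, 3), round(adjusted, 3))  # prints: 0.44 0.2
```

The adjusted figure matches the effect built into the toy model without any randomisation, but only because the right variable was adjusted for. Choosing that variable is exactly what the explicit causal model is needed for.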
Using such causal models also allows us to ascend to the third rung: reasoning about counterfactuals, where experiments are in principle impossible. This gives us power to reason about different worlds: What’s the probability that Fred would have died from lung cancer if he hadn’t smoked? What’s the probability that the heat wave would have happened with less CO2 in the atmosphere?
This is a very nicely written book, with many real world examples. The historical detail included shows how and why statisticians neglected causality. It is not always an easy read – the concepts are quite intricate in places – but it is a crucially important read. We should never again bow down to “correlation is not causation”: we now know how to discover when it is.