Books : reviews

Judea Pearl.
Probabilistic Reasoning in Intelligent Systems: networks of plausible inference: revised edn.
Morgan Kaufmann. 1991

PROBABILISTIC REASONING IN INTELLIGENT SYSTEMS is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic.

The author distinguishes syntactic and semantic approaches to uncertainty — and offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognition — in short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information.

PROBABILISTIC REASONING IN INTELLIGENT SYSTEMS will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

Second printing (1991) includes expanded BIBLIOGRAPHICAL AND HISTORICAL REMARKS sections for each chapter and updated references throughout.

Judea Pearl.
Causality: models, reasoning, and inference.
CUP. 2000

Judea Pearl, Madelyn Glymour, Nicholas P. Jewell.
Causal Inference in Statistics: a primer.
Wiley. 2016

Causality is central to the understanding and use of data. Without an understanding of cause-effect relationships, we cannot use data to answer questions as basic as “Does this treatment harm or help patients?” But though hundreds of introductory texts are available on statistical methods of data analysis, until now, no beginner-level book has been written about the exploding arsenal of methods that can tease causal information from data.

Causal Inference in Statistics fills that gap. Using simple examples and plain language, the book lays out how to define causal parameters; which assumptions are necessary to estimate causal parameters in a variety of situations; how to express those assumptions mathematically; whether those assumptions have testable implications; how to predict the effects of interventions; and how to reason counterfactually. These are the foundational tools that any student of statistics needs to acquire in order to use statistical methods to answer causal questions of interest.

This book is accessible to anyone with an interest in interpreting data, from undergraduates and professors to researchers and the interested layperson. Examples are drawn from a wide variety of fields, including medicine, public policy, and law; a brief introduction to probability and statistics is provided for the uninitiated; and each chapter comes with study questions to reinforce the reader's understanding.

Judea Pearl, Dana Mackenzie.
The Book of Why: the new science of cause and effect.
Penguin. 2018

rating : 2 : great stuff
review : 28 July 2019

We have all heard the old saying “correlation is not causation”. This is a problem for statistics, since all it can measure is correlation. Pearl here argues that this is because statisticians are restricting themselves too much, and that it is possible to do more. There is no magic; to get this more, you have to add something into the system, but that something is very reasonable: a causal model.

He organises his argument using the three-runged “ladder of causation”. On the bottom rung is pure statistics, reasoning about observations: what is the probability of recovery among these people we observe to have taken a drug? The second rung allows reasoning about interventions: what is the probability of recovery if I were to give these other people the drug? And the top rung includes reasoning about counterfactuals: what would have happened if that person had not received the drug?

Intervention (rung 2) is different from observation alone (rung 1) because the observations may be (almost certainly are) of a biased group: observing only those who took the drug for whatever reason, maybe because they were already sick in a particular hospital, or because they were rich enough to afford it, or some other confounding variable. The intervention, however, is a different case: people are specifically given the drug. The purely statistical way of moving up to rung 2 is to run a randomised controlled trial (RCT), to remove the effect of confounding variables, and thereby to make the observed results the same as the results from intervention. The RCT is often known as the “gold standard” for experimental research for this reason.
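
A tiny simulation makes the gap between rung 1 and rung 2 concrete. This sketch is mine, not Pearl’s: the model, numbers, and variable names (wealth as the confounder, a drug that adds 10% to the recovery probability) are invented for illustration.

```python
import random

random.seed(0)

def observational_sample():
    # Confounder: wealth influences both who takes the drug and recovery.
    wealthy = random.random() < 0.5
    takes_drug = random.random() < (0.8 if wealthy else 0.2)
    base = 0.5 if wealthy else 0.2                   # wealth itself aids recovery
    recovers = random.random() < base + (0.1 if takes_drug else 0.0)
    return takes_drug, recovers

def rct_sample():
    wealthy = random.random() < 0.5
    takes_drug = random.random() < 0.5               # randomised: severs wealth -> drug
    base = 0.5 if wealthy else 0.2
    recovers = random.random() < base + (0.1 if takes_drug else 0.0)
    return takes_drug, recovers

def recovery_gap(sampler, n=100_000):
    counts = {True: [0, 0], False: [0, 0]}           # drug? -> [recoveries, total]
    for _ in range(n):
        drug, rec = sampler()
        counts[drug][0] += rec
        counts[drug][1] += 1
    return counts[True][0] / counts[True][1] - counts[False][0] / counts[False][1]

print(f"observational gap: {recovery_gap(observational_sample):+.3f}")  # ~ +0.28, inflated
print(f"RCT gap:           {recovery_gap(rct_sample):+.3f}")            # ~ +0.10, the true effect
```

The observational gap nearly triples the drug’s true effect, because the wealthy are over-represented among drug takers and recover more anyway; randomisation removes exactly that.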

But here’s the thing: what is a confounding variable, and what is not? In order to know what to control for, and what to ignore, the experimenter has to have some kind of implicit causal model in their head. It has to be implicit, because statisticians are not allowed to talk about causality! Yet it must exist to some degree, otherwise how do we even know which variables to measure, let alone control for? Pearl argues for making this causal model explicit, and using it in the experimental design. Then, with respect to this now explicit causal model, it is possible to reason about results more powerfully. (He does not address how to discover this model: that is a different part of the scientific process, of modelling the world. However, observations can be used to test the model to some degree: some models are simply too causally strong to support the observed situation.)
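
Once the causal model is explicit, interventional quantities can sometimes be computed from purely observational data. Here is a sketch of the back-door adjustment formula, P(R | do(D)) = Σz P(R | D, Z=z) P(Z=z), applied to the same invented wealth-confounded model as above; this is my illustration of the adjustment idea, not code from the book.

```python
import random

random.seed(1)

# Observational data from the toy model:
# wealth Z -> drug D, wealth Z -> recovery R, drug D -> recovery R.
def sample():
    z = random.random() < 0.5                        # wealthy?
    d = random.random() < (0.8 if z else 0.2)        # wealth influences taking the drug
    r = random.random() < (0.5 if z else 0.2) + (0.1 if d else 0.0)
    return z, d, r

data = [sample() for _ in range(200_000)]

def p_r_given(d, z):
    rows = [r for (zz, dd, r) in data if dd == d and zz == z]
    return sum(rows) / len(rows)

def p_z(z):
    return sum(1 for (zz, _, _) in data if zz == z) / len(data)

# Back-door adjustment: P(R=1 | do(D=d)) = sum_z P(R=1 | D=d, Z=z) P(Z=z)
def p_r_do(d):
    return sum(p_r_given(d, z) * p_z(z) for z in (True, False))

print(f"adjusted effect: {p_r_do(True) - p_r_do(False):+.3f}")   # ~ +0.10, matching the RCT
```

No one in this dataset was ever randomised, yet with the causal model in hand the confounded observations alone recover the RCT answer.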

Pearl uses this framework to show how and why the RCT works. More importantly, he also shows that it is possible to reason about interventions sometimes from observations alone (hence data mining pure observations becomes more powerful), or sometimes with fewer controlled variables, without the need for a full RCT. This is extremely useful, since there are many cases where RCTs are unethical, impractical, or too expensive. RCTs are not the “gold standard” after all; they are basically a dumb sledgehammer approach. He also shows how to use the causal model to calculate which variables do need to be controlled for, and how controlling for certain variables is precisely the wrong thing to do.
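
And the flip side: controlling for the wrong variable actively creates bias. In this invented example (again mine, not the book’s), D and Y are genuinely independent, but both cause C; conditioning on the “collider” C manufactures a spurious association between them.

```python
import random

random.seed(3)

# D and Y are independent coin flips; C is a collider caused by both.
def sample():
    d = random.random() < 0.5
    y = random.random() < 0.5
    c = d or y                     # collider: caused by both D and Y
    return d, y, c

data = [sample() for _ in range(200_000)]

def association(rows):
    p_y_d1 = sum(y for d, y, _ in rows if d) / sum(1 for d, _, _ in rows if d)
    p_y_d0 = sum(y for d, y, _ in rows if not d) / sum(1 for d, _, _ in rows if not d)
    return p_y_d1 - p_y_d0

print(f"unconditioned:    {association(data):+.3f}")                       # ~ 0: independent
print(f"conditioned on C: {association([r for r in data if r[2]]):+.3f}")  # ~ -0.5: spurious
```

The intuition: among cases where C occurred but D did not, Y must have; so holding C fixed makes D and Y look negatively related even though neither affects the other.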

Using such causal models also allows us to ascend to the third rung: reasoning about counterfactuals, where experiments are in principle impossible. This gives us the power to reason about different worlds: What’s the probability that Fred would have died from lung cancer if he hadn’t smoked? What’s the probability that a heat wave would have happened with less CO2 in the atmosphere?
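
Counterfactuals are computed against a structural causal model by Pearl’s three-step recipe: abduction (update the latent background variables from the evidence), action (set the variable by hand), and prediction (re-run the mechanisms). A toy sketch, with an entirely invented mechanism and numbers:

```python
import random

random.seed(2)

# Toy structural model: lung cancer C = f(smokes S, latent frailty U),
# with U uniform on [0, 1]. Mechanism and thresholds are made up.
def f(smokes, u):
    return u > (0.5 if smokes else 0.8)   # smoking lowers the frailty threshold

n = 200_000
# Step 1, abduction: keep only the latent worlds consistent with the
# evidence "Fred smoked and got lung cancer".
consistent = [u for u in (random.random() for _ in range(n)) if f(True, u)]

# Steps 2 and 3, action + prediction: in those same worlds, force
# smokes = False and re-run the mechanism.
p_cancer_anyway = sum(f(False, u) for u in consistent) / len(consistent)

print(f"P(cancer had he not smoked | smoked and has cancer) = {p_cancer_anyway:.2f}")
# analytically 0.2 / 0.5 = 0.4: in 60% of consistent worlds he escapes it
```

Note that no experiment could ever produce this number directly: the same Fred cannot both smoke and not smoke. The explicit model, with its latent background variables, is what makes the question answerable at all.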

[p51] probabilities encode our beliefs about a static world, causality tells us whether and how probabilities change when the world changes, be it by intervention or by act of imagination.

This is a very nicely written book, with many real world examples. The historical detail included shows how and why statisticians neglected causality. It is not always an easy read – the concepts are quite intricate in places – but it is a crucially important read. We should never again bow down to “correlation is not causation”: we now know how to discover when it is.