
Adaptive Computation and Machine Learning

Causation, Prediction, and Search

This book is intended for anyone, regardless of discipline, who is interested in the use of statistical methods to help obtain scientific explanations or to predict the outcomes of actions, experiments or policies. Much of G. Udny Yule's work illustrates a vision of statistics whose goal is to investigate when and how causal influences may be reliably inferred, and their comparative strengths estimated, from statistical samples. Yule's enterprise has been largely replaced by Ronald Fisher's conception, in which there is a fundamental cleavage between experimental and non-experimental inquiry, and statistics is largely unable to aid in causal inference without randomized experimental trials. Every now and then members of the statistical community express misgivings about this turn of events, and, in our view, rightly so. Our work represents a return to something like Yule's conception of the enterprise of theoretical statistics and its potential practical benefits. If intellectual history in the 20th century had gone otherwise, there might have been a discipline to which our work belongs. As it happens, there is not. We develop material that belongs to statistics, to computer science, and to philosophy; the combination may not be entirely satisfactory for specialists in any of these subjects. We hope it is nonetheless satisfactory for its purpose.

530 pages, Hardcover

First published February 24, 1993

7 people are currently reading
312 people want to read

Ratings & Reviews


Community Reviews

5 stars: 9 (34%)
4 stars: 6 (23%)
3 stars: 10 (38%)
2 stars: 1 (3%)
1 star: 0 (0%)
Displaying 1 - 4 of 4 reviews
g.lkoa
24 reviews, 2 followers
September 21, 2018

{Provisional pitch} ~ slow-read, re-reading, to-be-read-again | very important. All three authors are recognized as leading names in theoretical and inferential statistics (in addition to being widely cited for their contributions to machine learning).

It is a great/valuable approach to the estimation of causal effects, a topic whose core problem consists – in perhaps the humblest graphical model – of an arrow from X to Y, that's to say a probability mapping like P(Y | cause(X = x)).
Manipulating this dummy system can be trivial insofar as it is reasonable to set X to some value x, measure Y, conceive some 'cause()' operator, and hence obtain a distribution.

If you cannot set up this lab or observational experiment, then all you get is a joint distribution over some covariates – meaning P(X, Y, Z1, …, Zn) – which by itself, of course, is not even slightly enough to get P(Y | cause(X = x)). A joint distribution, y'know, doesn't identify a causal effect; a small simulation of exactly this failure is sketched below.
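A minimal sketch of that last point, with toy parameters of my own choosing (numpy only, nothing from the book's text): two structural models that produce the same observational joint P(X, Y) but disagree about P(Y | cause(X = x)).

```python
import numpy as np

rng = np.random.default_rng(0)
n, b, s = 200_000, 0.8, 0.6

# Model A: X causes Y.  X ~ N(0,1),  Y := b*X + noise
x_a = rng.normal(0.0, 1.0, n)
y_a = b * x_a + rng.normal(0.0, s, n)

# Model B: Y causes X, with parameters tuned to reproduce Model A's joint.
v_y = b**2 + s**2                                   # var(Y) in Model A
y_b = rng.normal(0.0, np.sqrt(v_y), n)
x_b = (b / v_y) * y_b + rng.normal(0.0, np.sqrt(s**2 / v_y), n)

print(np.cov(x_a, y_a))        # ~[[1.0, 0.8], [0.8, 1.0]]
print(np.cov(x_b, y_b))        # ~the same matrix: observationally identical

# Intervene: overwrite X's structural equation with X := 1.
y_do_a = b * 1.0 + rng.normal(0.0, s, n)    # Model A: Y responds to X
y_do_b = y_b                                # Model B: Y ignores X entirely
print(y_do_a.mean(), y_do_b.mean())         # ~0.8 vs ~0.0
```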

The treatment on which this work on causality & inference relies is such that a causal model can be identified, whenever possible, for some variable X that is said to be a cause of a variable Y if and only if Y depends on X for its value in some mathematically _explicit_ sense. This can be expanded in many ways (there are a few efforts at catering to non-professionals and non-mathematicians), but the definition enacts exactly this: X is a cause of Y if Y decides its value in response to X. Causation is then said to be transitive, irreflexive, and antisymmetric.
All of this, as a broad concept, might be understandable in the end, or even obvious, but in fact it's essential to formally establish a realm where there's no need to resort to counterfactuals.

(Accordingly, say you want to know the effect of X on Y and you've gotten around to finding a set V of control variables: if (1.) V blocks every path from X to Y that has an arrow pointing into X and (2.) no node in V is a descendant of X, then you're sure all the items in your model are observational conditional probabilities. Meaning that V satisfies a back-door criterion; no counterfactuals required, indeed. A toy sketch follows.)
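Here is a hedged toy sketch of the adjustment that the criterion licenses, with a single binary confounder Z standing in for V (my own invented example, not the book's):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
z = rng.binomial(1, 0.5, n)                    # confounder
x = rng.binomial(1, 0.2 + 0.6 * z)             # treatment, influenced by Z
y = 2.0 * x + 3.0 * z + rng.normal(0, 1, n)    # true effect of X on Y: 2.0

# Naive contrast mixes in Z's effect (biased upward, ~3.8 here).
naive = y[x == 1].mean() - y[x == 0].mean()

# Back-door adjustment: average the within-stratum contrasts by P(Z=z).
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean())
    * (z == v).mean()
    for v in (0, 1)
)
print(naive, adjusted)                         # ~3.8 vs ~2.0
```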

*

Past the first two or three chapters on axioms, statistical indistinguishability and causally sufficient structures, the authors attempt to scrutinize – and, more importantly, test – the notion that correlations, while not implying causation, should nevertheless have some causal explanatory power, as unambiguous as possible (an idea, to be sure, as old as Simon's papers on computational complexity in decision making, i.e., 'bounded rationality'). This is equal to addressing the following non-trivial task: define some classes of correlations, including multi-variable correlations, in order to add some constraints on them within the domain where a pattern of said correlations is legal.

This logic pretty much boils down to the twin notions of Markov equivalence and distribution equivalence. For instance, given the variables {X, Y, Z}, a model may generate structures such as the chain X → Y → Z, its opposite X ← Y ← Z, the fork X ← Y → Z, and the collider X → Y ← Z. The first three represent only the statement that X and Z are conditionally independent given Y: in each of them Y separates X from the third variable, and those variables are therefore independent of each other conditional on Y. Drawing on Verma and Pearl, 1990, two model structures are known to be Markov equivalent (meaning fairly indiscernible) if they exhibit the same set of conditional-independence assertions, so those first three structures must be taken as equivalent; the collider, in which that conditional independence fails, is precisely what suggests which way the causal links lie. This very process of 'orienting' correlation chains can be mathematically difficult, tough to digest, but indeed the greatest value of this book is showing what causal discovery can be, in all its complexity, as a computational and an inferential problem. (A small simulation of the equivalence class is sketched below.)
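A linear-Gaussian simulation of that equivalence class, with toy coefficients of my own (numpy only): the chain and the fork show the same independence pattern, while conditioning on the collider's Y creates dependence between its parents.

```python
import numpy as np

def partial_corr_xz_given_y(x, y, z):
    """Correlate the residuals of X and Z after regressing each on Y."""
    rx = x - np.polyval(np.polyfit(y, x, 1), y)
    rz = z - np.polyval(np.polyfit(y, z, 1), y)
    return np.corrcoef(rx, rz)[0, 1]

rng = np.random.default_rng(2)
n = 200_000
e = lambda: rng.normal(0, 1, n)

x1 = e(); y1 = 0.8 * x1 + e(); z1 = 0.8 * y1 + e()   # chain    X -> Y -> Z
y2 = e(); x2 = 0.8 * y2 + e(); z2 = 0.8 * y2 + e()   # fork     X <- Y -> Z
x3 = e(); z3 = e(); y3 = 0.8 * x3 + 0.8 * z3 + e()   # collider X -> Y <- Z

for tag, (x, y, z) in [("chain", (x1, y1, z1)),
                       ("fork", (x2, y2, z2)),
                       ("collider", (x3, y3, z3))]:
    print(tag,
          round(np.corrcoef(x, z)[0, 1], 3),           # marginal rho(X, Z)
          round(partial_corr_xz_given_y(x, y, z), 3))  # rho(X, Z | Y)
# chain/fork: marginal != 0, partial ~ 0  -> same pattern, equivalent
# collider:   marginal ~ 0,  partial != 0 -> different pattern, orientable
```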

Other significant sections of the textbook are also devoted to the chief notion of partial identification – widely elaborated by Manski, even though not openly mentioned – and to partial correlation, a breath of fresh air that always comes into play because, no matter how unadulterated, perfect and informative your data are, very often they still aren't enough to track some parameter down to a point estimate; what is feasible, on the other hand, is to impose some boundary conditions in order to see whether that parameter is at least theoretically identifiable (depending on the models one is willing to buy into and the strength of the assumptions one is willing to make).

This is, needless to say, tremendously important in the non-natural sciences and in policy-making problems. The greater the number of parameters, the less credibly they can be identified; however, the authors show that some partial-identification limits, based on not-so-hard assumptions, can exist and be made formally explicit (there are plenty of examples drawn from a corpus of traditional assumptions – for instance, linear homogeneous curves in economics, or instrumental variables and the like, which rely on relatively hard assumptions and leave virtually no space for uncertainty). A worst-case bound is sketched below.
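A hedged sketch of the bounding idea, in Manski's worst-case style (the book does not present it under that name, and the data here are invented): with a binary outcome and no assumptions at all about the unobserved potential outcomes, the interventional mean is only bounded, never pinned down.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.binomial(1, 0.4, n)                        # observed treatment
y = rng.binomial(1, np.where(x == 1, 0.7, 0.3))    # observed outcome in {0,1}

p = x.mean()                                       # P(X = 1)
m1 = y[x == 1].mean()                              # E[Y | X = 1]

# E[Y | cause(X=1)] = p*E[Y|X=1] + (1-p)*E[Y(1)|X=0]; the last term is
# unobserved, but with Y bounded in [0, 1] it must lie between 0 and 1:
lower = p * m1 + (1 - p) * 0.0
upper = p * m1 + (1 - p) * 1.0
print(f"E[Y | cause(X=1)] lies in [{lower:.2f}, {upper:.2f}]")  # ~[0.28, 0.88]
```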

I read it mostly for consultation, so this was a rather scattered and incoherent appraisal. I confess.
Zhijing Jin
347 reviews, 61 followers
December 16, 2022
This book sets out clearly which parts of statistical causality are contributions of the CMU professors Peter Spirtes, Clark Glymour, and Richard Scheines; which parts are Judea Pearl's; and which are due to other researchers. It is a book suited to readers with a statistical background, and it also contains lots of detail about causal discovery algorithms. E.g., both the SGS and PC algorithms are named after the authors. (A stripped-down sketch of the PC adjacency search follows.)
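A stripped-down sketch of the PC algorithm's first phase, the adjacency search (my own toy illustration, not the book's pseudocode): start from a complete undirected graph and delete an edge whenever some conditioning set renders its endpoints independent. A hand-coded independence oracle for the chain X → Y → Z stands in for the statistical test a real implementation would use.

```python
from itertools import combinations

nodes = ["X", "Y", "Z"]
# Oracle: the only conditional independence in the chain X -> Y -> Z.
true_cis = {frozenset({"X", "Z"}): frozenset({"Y"})}

def independent(a, b, cond):
    return true_cis.get(frozenset({a, b})) == frozenset(cond)

# Start from the complete undirected graph over the nodes.
adj = {frozenset(p) for p in combinations(nodes, 2)}

for depth in range(len(nodes) - 1):          # grow conditioning-set size
    for pair in list(adj):
        a, b = sorted(pair)
        neighbors = [m for m in nodes if m != b and frozenset({m, a}) in adj]
        if any(independent(a, b, cond)
               for cond in combinations(neighbors, depth)):
            adj.discard(pair)                # separating set found: drop edge

print(sorted(tuple(sorted(p)) for p in adj))  # [('X', 'Y'), ('Y', 'Z')]
```

In the full algorithm the separating sets are recorded and then used to orient colliders, which is how the Markov-equivalence reasoning from the previous review turns into arrows.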

Despite the amazing capability to discover causation from correlation under some conditions, the vision ahead is to loosen the assumptions, or to align them more with real-world scenarios.

Video lecture by the first author introducing causal discovery at MIT: https://youtu.be/C3iOXymDoIU
Joey Chen
11 reviews, 2 followers
June 4, 2025
The structure of this book is probably the best among all the causal inference textbooks I've read. I also appreciate that the authors are not reluctant to mention the limitations of the things they introduce.
However, many parts of it could still be further polished. For instance, the introduction of many abstract concepts looks abrupt and unmotivated; the notation is sometimes messy and inconsistent; the typesetting is kinda cluttered.
15 reviews
August 28, 2025
Extremely dense, but necessarily so in order to formalize the subject. Very useful in real life, and increasingly so in an AI-dominated world.
