The Book of Why: The New Science of Cause and Effect
Read from September 21, 2020 to March 19, 2021
3%
P(L | D) may be totally different from P(L | do(D)). This difference between seeing and doing is fundamental and explains why we do not regard the falling barometer to be a cause of the coming storm. Seeing the barometer fall increases the probability of the storm, while forcing it to fall does not affect this probability.
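A minimal simulation of the barometer example (a sketch with invented numbers, not the book's code) makes the seeing/doing gap concrete: low atmospheric pressure drives both the barometer and the storm, so observing a falling barometer is informative about the storm, while forcing the needle down changes nothing.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

low_pressure = rng.random(n) < 0.3            # hidden common cause
barometer_falls = low_pressure                # the barometer faithfully tracks pressure
storm = low_pressure & (rng.random(n) < 0.8)  # storms mostly follow low pressure

# Seeing: condition on having observed the barometer fall.
p_see = storm[barometer_falls].mean()

# Doing: force the needle down by hand. The storm mechanism is untouched, so the
# storm rate under the intervention is just the overall storm rate.
p_do = storm.mean()

print(f"P(storm | barometer falls)     = {p_see:.2f}")   # about 0.80
print(f"P(storm | do(barometer falls)) = {p_do:.2f}")    # about 0.24
```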
3%
The ability to reflect on one’s past actions and envision alternative scenarios is the basis of free will and social responsibility.
6%
the connection between imagining and causal relations is almost self-evident. It is useless to ask for the causes of things unless you can imagine their consequences.
7%
We say that one event is associated with another if observing one changes the likelihood of observing the other.
7%
The goal of strong AI is to produce machines with humanlike intelligence, able to converse with and guide humans. Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality.
7%
Intervention ranks higher than association because it involves not just seeing but changing what is.
12%
Sons of tall men tend to be taller than average—but not as tall as their fathers. Sons of short men tend to be shorter than average—but not as short as their fathers.
12%
If students take two different standardized tests on the same material, the ones who scored high on the first test will usually score higher than average on the second test but not as high as they did the first time. This phenomenon of regression to the mean is ubiquitous in all facets of life, education, and business.
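A quick simulation (a sketch with made-up numbers, not from the book) reproduces this: give each student a fixed ability, add independent noise to each test, and the top scorers on the first test stay above average on the second while sliding back toward the mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
ability = rng.normal(100, 10, n)           # each student's stable skill
test1 = ability + rng.normal(0, 10, n)     # skill plus independent luck, test 1
test2 = ability + rng.normal(0, 10, n)     # skill plus independent luck, test 2

top = test1 > np.percentile(test1, 90)     # the top scorers on the first test
print(f"their mean on test 1: {test1[top].mean():.1f}")   # well above 100
print(f"their mean on test 2: {test2[top].mean():.1f}")   # above 100, but closer to it
```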
14%
“Whig history” was the epithet used to mock the hindsighted style of history writing, which focused on successful theories and experiments and gave little credit to failed theories and dead ends. The modern style of history writing became more democratic, treating chemists and alchemists with equal respect and insisting on understanding all theories in the social context of their own time.
20%
Unlike correlation and most of the other tools of mainstream statistics, causal analysis requires the user to make a subjective commitment. She must draw a causal diagram that reflects her qualitative belief—or, better yet, the consensus belief of researchers in her field of expertise—about the topology of the causal processes at work. She must abandon the centuries-old dogma of objectivity for objectivity’s sake. Where causation is concerned, a grain of wise subjectivity tells us more about the real world than any amount of objectivity.
20%
In addition, in many cases it can be proven that the influence of prior beliefs vanishes as the size of the data increases, leaving a single objective conclusion in the end.
28%
A Bayesian network is literally nothing more than a compact representation of a huge probability table.
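A back-of-the-envelope count (an illustration, not from the book) shows the compression: ten binary variables need 2^10 - 1 = 1,023 numbers as a raw joint table, but only 19 if they happen to form a simple chain-shaped network.

```python
n_vars = 10
full_table = 2**n_vars - 1               # independent entries in the raw joint table
chain_network = 1 + (n_vars - 1) * 2     # P(X1), plus P(Xi = 1 | Xi-1) for each parent value
print(full_table, chain_network)         # 1023 vs 19
```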
29%
Confounding bias occurs when a variable influences both who is selected for the treatment and the outcome of the experiment.
31%
Nature is like a genie that answers exactly the question we pose, not necessarily the one we intend to ask.
31%
Fisher realized that an uncertain answer to the right question is much better than a highly certain answer to the wrong question.
32%
Confounding, then, should simply be defined as anything that leads to a discrepancy between the two: P(Y | X) ≠ P(Y | do(X)).
34%
I define confounding as anything that makes P(Y | do(X)) differ from P(Y | X).
34%
a back-door path is any path from X to Y that starts with an arrow pointing into X. X and Y will be deconfounded if we block every back-door path (because such paths allow spurious correlation between X and Y). If we do this by controlling for some set of variables Z, we also need to make sure that no member of Z is a descendant of X on a causal path; otherwise we might partly or completely close off that path.
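A toy simulation (a sketch with invented numbers, not the book's code) of the simplest back-door situation, X <- Z -> Y together with X -> Y: the naive treated-versus-untreated contrast is inflated by the open back-door path, while averaging the Z-specific contrasts recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
z = rng.random(n) < 0.5                         # confounder
x = rng.random(n) < np.where(z, 0.8, 0.2)       # Z raises the chance of treatment
y = rng.random(n) < 0.1 + 0.2 * x + 0.5 * z     # true effect of X on Y is +0.2

naive = y[x].mean() - y[~x].mean()              # contaminated by the back-door path

# Back-door adjustment: average the Z-specific contrasts, weighted by P(Z = z).
adjusted = sum(
    (y[x & (z == v)].mean() - y[~x & (z == v)].mean()) * (z == v).mean()
    for v in (True, False)
)
print(f"naive difference:    {naive:.2f}")      # about 0.50, badly inflated
print(f"adjusted difference: {adjusted:.2f}")   # about 0.20, the causal effect
```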
35%
I consider the complete solution of the confounding problem one of the main highlights of the Causal Revolution because it ended an era of confusion that has probably resulted in many wrong decisions in the past.
37%
“dose-response effect”: if substance A causes a biological effect B, then usually (though not always) a larger dose of A causes a stronger response B.
40%
the cultural shocks that emanate from new scientific findings are eventually settled by cultural realignments that accommodate those findings—not by concealment. A prerequisite for this realignment is that we sort out the science from the culture before opinions become inflamed.
40%
Paradoxes arise when we misapply the rules we have learned in one realm to the other.
41%
The lesson is quite simple: the way that we obtain information is no less important than the information itself.
41%
This is a general theme of Bayesian analysis: any hypothesis that has survived some test that threatens its validity becomes more likely. The greater the threat, the more likely it becomes after surviving.
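Bayes' rule makes the claim quantitative; in a quick check with made-up numbers (not from the book), the harder a test is to pass when the hypothesis is false, the bigger the boost from surviving it.

```python
def posterior(prior, p_pass_if_true, p_pass_if_false):
    """P(hypothesis | test passed), by Bayes' rule."""
    num = prior * p_pass_if_true
    return num / (num + (1 - prior) * p_pass_if_false)

print(posterior(0.5, 0.9, 0.50))   # easy to pass by luck: posterior about 0.64
print(posterior(0.5, 0.9, 0.05))   # severe test:          posterior about 0.95
```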
41%
In my opinion, a true resolution of a paradox should explain why we see it as a paradox in the first place.
42%
conditioning on a collider creates a spurious association
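A short simulation (an invented example in the spirit of the book's talent-and-looks discussions) shows the effect: the two traits are generated independently, but among candidates admitted for having either one, they become negatively associated.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
talent = (rng.random(n) < 0.3).astype(float)
looks = (rng.random(n) < 0.3).astype(float)
admitted = (talent + looks) > 0                # collider: talent -> admitted <- looks

print(np.corrcoef(talent, looks)[0, 1])                       # about 0.00 overall
print(np.corrcoef(talent[admitted], looks[admitted])[0, 1])   # clearly negative
```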
42%
We live our lives as if the common cause principle were true. Whenever we see patterns, we look for a causal explanation. In fact, we hunger for an explanation, in terms of stable mechanisms that lie outside the data. The most satisfying kind of explanation is direct causation: X causes Y. When that fails, finding a common cause of X and Y will usually satisfy us.
44%
Simpson’s paradox alerts us to cases where at least one of the statistical trends (either in the aggregated data, the partitioned data, or both) cannot represent the causal effects.
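The classic kidney-stone figures (standard textbook numbers, not quoted from this book) show such a case: treatment A does better within each stone-size group yet worse overall, because it was mostly assigned to the harder, large-stone cases.

```python
# Counts are (recoveries, patients) in each stone-size group.
data = {
    "A": {"small stones": (81, 87),   "large stones": (192, 263)},
    "B": {"small stones": (234, 270), "large stones": (55, 80)},
}

for treatment, groups in data.items():
    for group, (rec, tot) in groups.items():
        print(f"{treatment} {group}: {rec / tot:.0%}")
    rec_all = sum(r for r, _ in groups.values())
    tot_all = sum(t for _, t in groups.values())
    print(f"{treatment} overall: {rec_all / tot_all:.0%}")
# A: small 93%, large 73%, overall 78%
# B: small 87%, large 69%, overall 83%   <- the aggregate trend reverses
```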
47%
Path coefficients are fundamentally different from regression coefficients, although they can often be computed from the latter.
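A linear toy model (a sketch with invented coefficients, not from the book) illustrates the distinction: with Z -> X, Z -> Y, and a path coefficient of 0.5 on X -> Y, the regression coefficient of Y on X alone is not the path coefficient, but the path coefficient can be computed from the multiple regression of Y on X and Z.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
z = rng.normal(size=n)                        # confounder
x = 0.8 * z + rng.normal(size=n)
y = 0.5 * x + 0.7 * z + rng.normal(size=n)    # path coefficient X -> Y is 0.5

simple = np.polyfit(x, y, 1)[0]               # Y on X alone: about 0.84, biased upward
multiple = np.linalg.lstsq(np.column_stack([x, z, np.ones(n)]), y, rcond=None)[0]
print(f"regression coefficient of Y on X: {simple:.2f}")
print(f"path coefficient X -> Y (from Y on X and Z): {multiple[0]:.2f}")   # about 0.50
```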
49%
Rule 1 says that when we observe a variable W that is irrelevant to Y (possibly conditional on other variables Z), the probability distribution of Y will not change.
50%
We know that if a set Z of variables blocks all back-door paths from X to Y, then conditional on Z, do(X) is equivalent to see(X). We can, therefore, write P(Y | do(X), Z) = P(Y | X, Z) if Z satisfies the back-door criterion. We adopt this as Rule 2 of our axiomatic system.
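Rule 2 can be checked on the same kind of toy confounded model used above (a sketch, not the book's code): simulate the model once observationally and once under do(X = 1); within each stratum of Z the two conditional distributions of Y agree.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

def simulate(do_x=None):
    z = rng.random(n) < 0.5
    x = (rng.random(n) < np.where(z, 0.8, 0.2)) if do_x is None else np.full(n, do_x)
    y = rng.random(n) < 0.1 + 0.2 * x + 0.5 * z
    return z, x, y

z_obs, x_obs, y_obs = simulate()          # observational regime: X chosen "naturally"
z_do, _, y_do = simulate(do_x=True)       # interventional regime: do(X = 1)

for zv in (False, True):
    see = y_obs[x_obs & (z_obs == zv)].mean()
    do = y_do[z_do == zv].mean()
    print(f"Z={zv}:  P(Y | X=1, Z) = {see:.3f}   P(Y | do(X=1), Z) = {do:.3f}")  # they agree
```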
50%
Rule 3 is quite simple: it essentially says that we can remove do(X) from P(Y | do(X)) in any case where there are no causal paths from X to Y. That is, P(Y | do(X)) = P(Y) if there is no path from X to Y with only forward-directed arrows.
54%
From the point of view of causal analysis, this teaches us a good lesson: in any study of interventions, we need to ask whether the variable we’re actually manipulating (lifetime LDL levels) is the same as the variable we think we are manipulating (current LDL levels). This is part of the “skillful interrogation of nature.”
55%
Responsibility and blame, regret and credit: these concepts are the currency of a causal mind. To make any sense of them, we must be able to compare what did happen with what would have happened under some alternative hypothesis.
55%
our ability to conceive of alternative, nonexistent worlds separated us from our protohuman ancestors and indeed from any other creature on the planet. Every other creature can see what is. Our gift, which may sometimes be a curse, is that we can see what might have been.
58%
mistaking a mediator for a confounder is one of the deadliest sins in causal inference and may lead to the most outrageous errors. The latter invites adjustment; the former forbids it.
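A short simulation (a sketch with invented coefficients, not from the book) shows why: in the chain X -> M -> Y, adjusting for the mediator M blocks the very path that transmits the effect, so the adjusted coefficient on X collapses to zero.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)          # mediator
y = 0.6 * m + rng.normal(size=n)          # X affects Y only through M

total = np.polyfit(x, y, 1)[0]            # unadjusted: about 0.48 = 0.8 * 0.6, the real effect
adjusted = np.linalg.lstsq(np.column_stack([x, m, np.ones(n)]), y, rcond=None)[0][0]
print(f"effect of X, no adjustment:          {total:.2f}")     # about 0.48
print(f"'effect' of X after adjusting for M: {adjusted:.2f}")  # about 0.00
```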
67%
In fact he had the right idea when he distinguished between bias and discrimination. Bias is a slippery statistical notion, which may disappear if you slice the data a different way. Discrimination, as a causal concept, reflects reality and must remain stable.
74%
Anytime you see a paper or a study that analyzes the data in a model-free way, you can be certain that the output of the study will merely summarize, and perhaps transform, but not interpret the data.
75%
Data interpretation means hypothesizing on how things operate in the real world.