The Book of Why: The New Science of Cause and Effect
Kindle Notes & Highlights
1%
All because we asked a simple question: Why?
1%
questions like these: •
3%
This confusion between seeing and doing has resulted in a fountain of paradoxes, some of which we will entertain in this book.
3%
When the scientific question of interest involves retrospective thinking, we call on another type of expression unique to causal reasoning called a counterfactual.
3%
The first of the outputs is a Yes/No decision as to whether the given query
4%
First, notice that we collect data only after we posit the causal model, after we state the scientific query we wish to answer, and after we derive the estimand.
5%
Chapter 1 assembles the three steps of observation, intervention, and counterfactuals into the Ladder of
6%
This modularity is a key feature of causal models.
6%
For that, you need to have achieved a level of understanding that permits imagining.
7%
At the first level, association, we are looking for regularities in observations.
7%
Nevertheless, deep learning has succeeded primarily by showing that certain questions or tasks we thought were difficult are in fact not.
7%
This lack of flexibility and adaptability is inevitable in any system that works at the first level of the Ladder of Causation.
7%
Many scientists have been quite traumatized to learn that none of the methods they learned in statistics is sufficient even to articulate, let alone answer, a simple question like “What happens if we double the price?”
7%
A very direct way to predict the result of an intervention is to experiment with it under carefully controlled conditions.
8%
These queries take us to the top rung of the Ladder of Causation, the level of counterfactuals,
8%
characteristic queries for the third rung of the Ladder of Causation are “What if I had done…?” and “Why?”
9%
How can machines (and people) represent causal knowledge in a way that would enable them to access the necessary information swiftly, answer questions correctly, and do it with ease, as a three-year-old child can? In fact, this is the main question we address in this book.
9%
One major contribution of AI to the study of cognition has been the paradigm “Representation first, acquisition second.”
9%
We also set that variable manually to its prescribed value (true). The rationale for this peculiar “surgery” is simple: making an event happen means that you emancipate it from all other influences and subject it to one and only one influence—that which enforces its happening.
9%
then the opposite is true. This
10%
When we draw an arrow from X to Y, we are implicitly saying that some probability rule or function specifies how Y would change if X were to change.
10%
Decades’ worth of experience with these kinds of questions has convinced me that, in both a cognitive and a philosophical sense, the idea of causes and effects is much more fundamental than the idea of probability.
10%
estimand (i.e., recipe for answering the query)
11%
The expression P(Y | X) > P(Y), on the other hand, speaks only about observations and means: “If we see X, then the probability of Y increases.”
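The gap between seeing and doing in this highlight can be made concrete with a small enumeration. The sketch below is only an illustration under assumed numbers: the variable Z, the probability tables, and all function names are hypothetical and do not come from the book. Z is a common cause of X and Y, and X has no effect on Y at all, yet observing X still raises the probability of Y.

```python
from itertools import product

# Hypothetical model (not from the book): Z -> X and Z -> Y, with no arrow X -> Y.
P_Z = {0: 0.5, 1: 0.5}            # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(X = 1 | Z = z)
P_Y1_GIVEN_Z = {0: 0.1, 1: 0.7}   # P(Y = 1 | Z = z)


def p_y1_given_xz(x, z):
    """P(Y = 1 | X = x, Z = z); by construction Y ignores X."""
    return P_Y1_GIVEN_Z[z]


def joint(z, x, y):
    """Observational joint probability P(Z = z, X = x, Y = y)."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = p_y1_given_xz(x, z) if y == 1 else 1 - p_y1_given_xz(x, z)
    return P_Z[z] * px * py


def p_y():
    """Baseline P(Y = 1)."""
    return sum(joint(z, x, 1) for z, x in product((0, 1), repeat=2))


def p_y_given_x(x):
    """Seeing: P(Y = 1 | X = x), computed from the observational joint."""
    num = sum(joint(z, x, 1) for z in (0, 1))
    den = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return num / den


def p_y_do_x(x):
    """Doing: P(Y = 1 | do(X = x)); the arrow Z -> X is cut, so Z keeps P(Z)."""
    return sum(P_Z[z] * p_y1_given_xz(x, z) for z in (0, 1))


print("P(Y)           =", round(p_y(), 3))
print("P(Y | X=1)     =", round(p_y_given_x(1), 3))
print("P(Y | do(X=1)) =", round(p_y_do_x(1), 3))
```

With these assumed tables, seeing X = 1 lifts the probability of Y from 0.4 to about 0.58, while setting X = 1 by intervention leaves it at 0.4. The inequality P(Y | X) > P(Y) therefore holds here even though X is not a cause of Y.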
11%
She proposed that we should condition on any factor that is “causally relevant” to the effect.
13%
Eventually, he opted for the more normal English word “correlated.”
13%
Later he realized an even more startling fact: in generational comparisons, the temporal order could be reversed.
13%
The answer is that we are talking not about an individual father and an individual son but about two populations.
13%
Galton stumbled on an important fact: the predictions always fall on a line, which he called the regression line, which is less steep than the major axis (or axis of symmetry) of
13%
Once again this shows that where regression to the mean is concerned, there is no difference between cause and effect.
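A quick simulation can illustrate why regression to the mean is blind to direction. This is a sketch under assumed numbers, not Galton's data: heights are expressed in standard deviations from the population mean, the correlation r = 0.5 is hypothetical, and the names `fathers`, `sons`, and `slope` are made up for the example.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)   # fixed seed so the run is repeatable
r = 0.5                  # assumed father-son correlation (hypothetical)

fathers, sons = [], []
for _ in range(20000):
    f = rng.gauss(0, 1)
    # Son's height shares a fraction r with the father's; the noise term is
    # scaled so that sons also have unit variance.
    s = r * f + (1 - r * r) ** 0.5 * rng.gauss(0, 1)
    fathers.append(f)
    sons.append(s)

slope_son_on_father = slope(fathers, sons)   # predict sons from fathers
slope_father_on_son = slope(sons, fathers)   # predict fathers from sons

print("son on father:", round(slope_son_on_father, 2))
print("father on son:", round(slope_father_on_son, 2))
```

Both slopes come out close to r and below 1, so a tall father predicts a less exceptional son and a tall son predicts a less exceptional father. The data were generated from fathers to sons, yet the regression shows the same shrinkage in either direction.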
14%
For the first time, Galton’s idea of correlation gave an objective measure, independent of human judgment or interpretation, of how two variables are related to one another.
14%
he was led astray by his beautiful but flawed causal model, and later, having discovered the beauty of correlation, he came to believe that causality was no longer needed.
14%
statistics became a model-blind data-reduction enterprise,
15%
To summarize, causation for Pearson is only a matter of repetition and, in the deterministic sense, can never be proven.
15%
More generally, Pearson belonged to a philosophical school called positivism, which holds that the universe is a product of human thought and that science is only a description of those thoughts.
15%
This example is a case of a more general phenomenon called Simpson’s paradox.
17%
Although we don’t need to know every causal relation between the variables of interest and might be able to draw some conclusions with only partial information, Wright makes one point with absolute clarity: you cannot draw causal conclusions without some causal hypotheses. This echoes what we concluded in Chapter 1: you cannot answer a question on rung two of the Ladder of Causation using only data collected from rung one.
17%
Many people still make Niles’s mistake of thinking that the goal of causal analysis is to prove that X is a cause of Y or else to find the cause of Y from scratch.
17%
In contrast, the focus of Wright’s research, as well as this book, is representing plausible causal knowledge in some mathematical language, combining it with empirical data, and answering causal queries that are of practical value.
18%
taught him that the surest kind of knowledge is what you construct yourself.
18%
And now the algebraic magic: the amount of bias is equal to the product of the path coefficients along that path (in other words, l times l′ times q). The total correlation, then, is just the sum of the path coefficients along the two paths: algebraically, p + (l × l′ × q) = 5.66 grams per day.
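The rule in this highlight, direct effect plus the product of coefficients along the biasing path, can be checked numerically. The sketch below assumes a linear diagram in which the biasing path runs P ← L → Q → X with coefficients l′, l, and q, alongside the direct arrow P → X with coefficient p. That arrow layout, the coefficient values, and the unit-variance scaling are all assumptions made for illustration; they are not Wright's numbers and will not reproduce the 5.66 figure.

```python
import random

# Hypothetical path coefficients (illustrative only, not Wright's estimates).
p = 0.4         # direct effect P -> X
l_prime = 0.5   # L -> P
l = 0.6         # L -> Q
q = 0.7         # Q -> X

rng = random.Random(1)   # fixed seed so the run is repeatable
P_vals, X_vals = [], []
for _ in range(50000):
    L = rng.gauss(0, 1)
    # Noise terms are scaled so that P and Q each have unit variance.
    P = l_prime * L + (1 - l_prime ** 2) ** 0.5 * rng.gauss(0, 1)
    Q = l * L + (1 - l ** 2) ** 0.5 * rng.gauss(0, 1)
    X = p * P + q * Q + rng.gauss(0, 1)
    P_vals.append(P)
    X_vals.append(X)

# Observed regression slope of X on P, mixing the direct and biasing paths.
n = len(P_vals)
mean_P, mean_X = sum(P_vals) / n, sum(X_vals) / n
cov = sum((a - mean_P) * (b - mean_X) for a, b in zip(P_vals, X_vals))
var = sum((a - mean_P) ** 2 for a in P_vals)
observed = cov / var

bias = l * l_prime * q   # product of coefficients along P <- L -> Q -> X
predicted = p + bias     # the path-analysis rule quoted above

print("observed slope:", round(observed, 2))
print("p + l*l'*q    :", round(predicted, 2))
```

In this setup the observed slope lands near p + l × l′ × q rather than near p alone, which is the sense in which the biasing path inflates the raw association. Recovering p from data would then require the additional equations for the other measured pairs.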
18%
But here’s where the ingenuity of path coefficients really shines. Wright’s methods tell us how to express each of the measured correlations in terms of the path coefficients. After doing this for each of the measured pairs (P, X), (L, X), and (L, P), we obtain three equations that can be solved algebraically for the unknown path coefficients, p, l′, and l × q.
18%
Lesson one from this example: causal analysis allows us to quantify processes in the real world, not just patterns in the data.
18%
Lesson two, whether you followed the mathematics or not: in path analysis you draw conclusions about individual causal relationships by examining the diagram as a whole.
19%
The user has to have a hypothesis and must devise an appropriate diagram of multiple causal sequences.”
19%
path analysis requires scientific thinking, as does every exercise in causal inference.
19%
“Statistics may be regarded as… the study of methods of the reduction of data.”
19%
But Fisher was right about one point: once you remove causation from statistics, reduction of data is the only thing left.
19%
Of interest to us are two of Karlin’s arguments.
20%
For Wright, drawing a path diagram is not a statistical exercise; it is an exercise in genetics, economics, psychology, or whatever the scientist’s own field of expertise is.