Kindle Notes & Highlights
All because we asked a simple question: Why?
questions like these:
This confusion between seeing and doing has resulted in a fountain of paradoxes, some of which we will entertain in this book.
When the scientific question of interest involves retrospective thinking, we call on another type of expression unique to causal reasoning called a counterfactual.
The first of the outputs is a Yes/No decision as to whether the given query
First, notice that we collect data only after we posit the causal model, after we state the scientific query we wish to answer, and after we derive the estimand.
Chapter 1 assembles the three steps of observation, intervention, and counterfactuals into the Ladder of Causation.
This modularity is a key feature of causal models.
For that, you need to have achieved a level of understanding that permits imagining.
At the first level, association, we are looking for regularities in observations.
Nevertheless, deep learning has succeeded primarily by showing that certain questions or tasks we thought were difficult are in fact not.
This lack of flexibility and adaptability is inevitable in any system that works at the first level of the Ladder of Causation.
Many scientists have been quite traumatized to learn that none of the methods they learned in statistics is sufficient even to articulate, let alone answer, a simple question like “What happens if we double the price?”
A very direct way to predict the result of an intervention is to experiment with it under carefully controlled conditions.
These queries take us to the top rung of the Ladder of Causation, the level of counterfactuals,
characteristic queries for the third rung of the Ladder of Causation are “What if I had done…?” and “Why?”
How can machines (and people) represent causal knowledge in a way that would enable them to access the necessary information swiftly, answer questions correctly, and do it with ease, as a three-year-old child can? In fact, this is the main question we address in this book.
One major contribution of AI to the study of cognition has been the paradigm “Representation first, acquisition second.”
We also set that variable manually to its prescribed value (true). The rationale for this peculiar “surgery” is simple: making an event happen means that you emancipate it from all other influences and subject it to one and only one influence—that which enforces its happening.
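A minimal sketch of this surgery on a toy model (the variables, probabilities, and mechanism functions below are invented for illustration; they are not from the text): each variable is determined by its own mechanism, and intervening on a variable discards that mechanism and pins the value by fiat, leaving every other mechanism untouched.

import random

# Hypothetical three-variable model: season -> sprinkler -> wet_grass.
def season(values):
    return random.random() < 0.5                           # exogenous coin flip

def sprinkler(values):
    return values["season"] and random.random() < 0.9      # listens to season

def wet_grass(values):
    return values["sprinkler"] and random.random() < 0.8   # listens to sprinkler

MODEL = {"season": season, "sprinkler": sprinkler, "wet_grass": wet_grass}

def sample(model, do=None):
    """Draw one world; `do` maps a variable to a fixed value, deleting its
    mechanism (the 'surgery') so no other variable can influence it."""
    do = do or {}
    values = {}
    for name, mechanism in model.items():   # dict is in causal (topological) order
        values[name] = do[name] if name in do else mechanism(values)
    return values

observed = sample(MODEL)                             # seeing: season still drives the sprinkler
intervened = sample(MODEL, do={"sprinkler": True})   # doing: the incoming arrow is severed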
then the opposite is true.
When we draw an arrow from X to Y, we are implicitly saying that some probability rule or function specifies how Y would change if X were to change.
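In the usual structural-equation shorthand (a standard rendering of this idea; the symbols f_Y and U_Y are generic placeholders, not notation quoted from the text), the arrow from X to Y abbreviates an assignment

    Y := f_Y(X, U_Y)

where f_Y is some unspecified function and U_Y stands for all influences on Y left out of the model; the arrow asserts only that such a rule exists, not what form it takes.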
Decades’ worth of experience with these kinds of questions has convinced me that, in both a cognitive and a philosophical sense, the idea of causes and effects is much more fundamental than the idea of probability.
estimand (i.e., recipe for answering the query)
The expression P(Y | X) > P(Y), on the other hand, speaks only about observations and means: “If we see X, then the probability of Y increases.”
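A hedged illustration of the gap between seeing and doing, using a made-up confounded system (none of these names or probabilities come from the text): X and Y share a common cause Z, and X has no effect on Y at all, so observing X raises the probability of Y while forcing X leaves it unchanged.

import random

random.seed(0)
N = 100_000

def simulate(do_x=None):
    """One draw from a toy model with arrows Z -> X and Z -> Y (no X -> Y)."""
    z = random.random() < 0.5
    x = do_x if do_x is not None else (random.random() < (0.9 if z else 0.1))
    y = random.random() < (0.8 if z else 0.2)
    return x, y

samples = [simulate() for _ in range(N)]
p_y = sum(y for _, y in samples) / N                  # P(Y), about 0.50

seen = [(x, y) for x, y in samples if x]
p_y_given_x = sum(y for _, y in seen) / len(seen)     # P(Y | X=1), about 0.74

do_samples = [simulate(do_x=True) for _ in range(N)]
p_y_do_x = sum(y for _, y in do_samples) / N          # P(Y | do(X=1)), about 0.50

print(p_y, p_y_given_x, p_y_do_x)   # seeing X raises P(Y); doing X does not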
She proposed that we should condition on any factor that is “causally relevant” to the effect.
Eventually, he opted for the more normal English word “correlated.”
Later he realized an even more startling fact: in generational comparisons, the temporal order could be reversed.
The answer is that we are talking not about an individual father and an individual son but about two populations.
Galton stumbled on an important fact: the predictions always fall on a line, which he called the regression line, which is less steep than the major axis (or axis of symmetry) of
Once again this shows that where regression to the mean is concerned, there is no difference between cause and effect.
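A small simulation of this symmetry (the heights, spreads, and correlation below are invented, not Galton's measurements): regressing sons' heights on fathers' and fathers' on sons' both yield slopes shallower than the major axis, so regression to the mean appears in whichever direction you look.

import random, statistics

random.seed(1)
N = 10_000

# Hypothetical father/son heights (cm): a shared component plus independent
# noise, giving equal variances and a correlation of roughly 0.5.
fathers, sons = [], []
for _ in range(N):
    shared = random.gauss(0, 5)
    fathers.append(175 + shared + random.gauss(0, 5))
    sons.append(175 + shared + random.gauss(0, 5))

def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    var = sum((a - mx) ** 2 for a in x) / len(x)
    return cov / var

# With equal variances the major axis has slope 1; both regressions fall short
# of it, and they do so symmetrically, implying no arrow of causation.
print(slope(fathers, sons))   # son on father, about 0.5
print(slope(sons, fathers))   # father on son, about 0.5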
For the first time, Galton’s idea of correlation gave an objective measure, independent of human judgment or interpretation, of how two variables are related to one another.
he was led astray by his beautiful but flawed causal model, and later, having discovered the beauty of correlation, he came to believe that causality was no longer needed.
statistics became a model-blind data-reduction enterprise,
To summarize, causation for Pearson is only a matter of repetition and, in the deterministic sense, can never be proven.
More generally, Pearson belonged to a philosophical school called positivism, which holds that the universe is a product of human thought and that science is only a description of those thoughts.
This example is a case of a more general phenomenon called Simpson’s paradox.
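A made-up numerical instance of the reversal (the counts below are illustrative, not data from the text): within each subgroup the treatment shows the higher recovery rate, yet when the groups are pooled the comparison flips.

# Hypothetical counts of (recovered, total) for each arm, split by sex.
data = {
    "men":   {"treated": (81, 87),   "untreated": (234, 270)},
    "women": {"treated": (192, 263), "untreated": (55, 80)},
}

def rate(recovered, total):
    return recovered / total

for group, arms in data.items():
    t, u = arms["treated"], arms["untreated"]
    print(f"{group:6s} treated {rate(*t):.0%}  untreated {rate(*u):.0%}")
    # men:   93% vs 87%;  women: 73% vs 69%  (treatment looks better in both)

# Pool the two groups and the direction reverses.
pooled = {
    arm: tuple(sum(vals) for vals in zip(*(data[g][arm] for g in data)))
    for arm in ("treated", "untreated")
}
t, u = pooled["treated"], pooled["untreated"]
print(f"pooled treated {rate(*t):.0%}  untreated {rate(*u):.0%}")
# pooled: 78% vs 83%  (treatment now looks worse)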
Although we don’t need to know every causal relation between the variables of interest and might be able to draw some conclusions with only partial information, Wright makes one point with absolute clarity: you cannot draw causal conclusions without some causal hypotheses. This echoes what we concluded in Chapter 1: you cannot answer a question on rung two of the Ladder of Causation using only data collected from rung one.
Many people still make Niles’s mistake of thinking that the goal of causal analysis is to prove that X is a cause of Y or else to find the cause of Y from scratch.
In contrast, the focus of Wright’s research, as well as this book, is representing plausible causal knowledge in some mathematical language, combining it with empirical data, and answering causal queries that are of practical value.
taught him that the surest kind of knowledge is what you construct yourself.
And now the algebraic magic: the amount of bias is equal to the product of the path coefficients along that path (in other words, l times l′ times q). The total correlation, then, is just the sum of the two paths' contributions: algebraically, p + (l × l′ × q) = 5.66 grams per day.
But here’s where the ingenuity of path coefficients really shines. Wright’s methods tell us how to express each of the measured correlations in terms of the path coefficients. After doing this for each of the measured pairs (P, X), (L, X), and (L, P), we obtain three equations that can be solved algebraically for the unknown path coefficients, p, l′, and l × q.
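A sketch of that algebra on a toy diagram (the diagram shape, the symbols a, p, b, and the "measured" correlations below are invented placeholders, not Wright's guinea-pig model or his numbers): each correlation is written, by Wright's tracing rule, as the sum over connecting paths of the product of the coefficients along each path, and the resulting system is solved for the unknowns.

import sympy as sp

# Hypothetical diagram: L -> P (coefficient a), P -> X (p), L -> X (b),
# with all variables standardized.
a, p, b = sp.symbols("a p b")

r_LP, r_PX, r_LX = 0.5, 0.6, 0.55      # made-up measured correlations

equations = [
    sp.Eq(a, r_LP),             # only one path connects L and P
    sp.Eq(p + a * b, r_PX),     # direct path P -> X, plus P <- L -> X
    sp.Eq(b + a * p, r_LX),     # direct path L -> X, plus L -> P -> X
]

solution = sp.solve(equations, (a, p, b), dict=True)[0]
print(solution)    # the path coefficients, recovered from correlations alone

Once the coefficients are in hand, the total correlation between any two variables is rebuilt by summing the path products, exactly as in the p + (l × l′ × q) expression above.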
Lesson one from this example: causal analysis allows us to quantify processes in the real world, not just patterns in the data.
Lesson two, whether you followed the mathematics or not: in path analysis you draw conclusions about individual causal relationships by examining the diagram as a whole.
“The user has to have a hypothesis and must devise an appropriate diagram of multiple causal sequences.”
path analysis requires scientific thinking, as does every exercise in causal inference.
“Statistics may be regarded as… the study of methods of the reduction of data.”
But Fisher was right about one point: once you remove causation from statistics, reduction of data is the only thing left.
Of interest to us are two of Karlin’s arguments.
For Wright, drawing a path diagram is not a statistical exercise; it is an exercise in genetics, economics, psychology, or whatever the scientist’s own field of expertise is.