Brian Cajes’s Kindle Notes & Highlights for The Book of Why: The New Science of Cause and Effect

The Book of Why: The New Science of Cause and Effect

by Judea Pearl

31%

randomization actually brings two benefits. First, it eliminates confounder bias (it asks Nature the right question). Second, it enables the researcher to quantify his uncertainty.

31%

Now, ninety years later, we can use the do-operator to fill in what Fisher wanted to but couldn’t ask. Let’s see, from a causal point of view, how randomization enables us to ask the genie the right question.

32%

That brings us to the punch line: randomization is a way of simulating Model 2. It disables all the old confounders without introducing any new confounders.

32%

That is the source of its power; there is nothing mysterious or mystical about it. It is nothing more or less than, as Joan Fisher Box said, “the skillful interrogation of Nature.”

32%

The experiment would, however, fail in its objective of simulating Model 2 if either the experimenter were allowed to use his...

32%

I will add to this a second punch line: there are other ways of simulating Model 2. One way, if you know what all the possible confounders are, is to measure and adjust for them. However, randomization does have one great advantage: it severs every incoming link to the randomized variable, including the ones we don’t know about or cannot measure (e.g., “Other” factors in Figures 4.4 to 4.6).

32%

By contrast, in a nonrandomized study, the experimenter must rely on her knowledge of the subject matter. If she is confident that her causal model accounts for a sufficient number of deconfounders and she has gathered data on them, then she can estimate the effect of Fertilizer on Yield in an unbiased way. But the danger is that she may have missed a confounding factor, and her estimate may therefore be biased.

32%

All things being equal, RCTs are still preferred to observational studies, just as safety nets are reco...

32%

But all things are not necessarily equal. In some cases, intervention may be physically impossible (for instance, in a study of the effect of obesity on heart disease, we cann...

32%

Fortunately, the do-operator gives us scientifically sound ways of determining causal effects from nonexperimental studies,

32%

Lacking a principled understanding of confounding, scientists could not say anything meaningful in observational studies where physical control over treatments is infeasible.

32%

The question we want to ask of Nature has to do with the causal relationship between X and Y, which is captured by the interventional probability P(Y | do(X)). Confounding, then, should simply be defined as anything that leads to a discrepancy between the two: P(Y | X) ≠ P(Y | do(X)). Why all the fuss?
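
A minimal sketch of that definition, assuming a hypothetical three-variable model (Z → X, Z → Y, X → Y) with made-up probabilities that do not come from the book. It computes P(Y | X) by conditioning on the joint distribution, and P(Y | do(X)) by cutting the arrow into X so that Z keeps its prior distribution.

```python
from itertools import product

# Hypothetical binary model: Z -> X, Z -> Y, X -> Y. All numbers are invented.
P_Z = {0: 0.5, 1: 0.5}
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,   # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}


def joint(x, y, z):
    """Joint probability P(X=x, Y=y, Z=z) under the observational model."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py


def p_y1_given_x(x):
    """Observational P(Y=1 | X=x): what we learn from seeing X=x."""
    num = sum(joint(x, 1, z) for z in (0, 1))
    den = sum(joint(x, y, z) for y, z in product((0, 1), repeat=2))
    return num / den


def p_y1_do_x(x):
    """Interventional P(Y=1 | do(X=x)): Z no longer influences X."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


seeing = p_y1_given_x(1) - p_y1_given_x(0)
doing = p_y1_do_x(1) - p_y1_do_x(0)
print(round(seeing, 2), round(doing, 2))
```

With these invented numbers the observational contrast is about 0.44 while the interventional contrast is about 0.20; the gap between the two is the confounding contributed by Z.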

34%

In fact, the noncausal paths are precisely the source of confounding. Remember that I define confounding as anything that makes P(Y | do(X)) differ from P(Y | X).

34%

The do-operator erases all the arrows that come into X, and in this way it prevents any information about X from flowing in the noncausal direction. Randomization has the same effect. So does statistical adjustment, if we pick the right variables to adjust.

34%

Controlling for descendants (or proxies) of a variable is like “partially” controlling for the variable itself. Controlling for a descendant of a mediator partly closes the pipe; controlling for a descendant of a collider partly opens the pipe.

34%

a back-door path is any path from X to Y that starts with an arrow pointing into X.

34%

X and Y will be deconfounded if we block every back-door path (because such paths allow spurious correlation between X and Y). If we do this by controlling for some set of variables Z, we also need to make sure that no member of Z is a descendant of X on a causal path; otherwise we might partly or completely close off that path.

34%

This path is already blocked by the collider at B, so we don’t need to control for anything. Many statisticians would control for B or C, thinking there is no harm in doing so as long as they occur before the treatment. A leading statistician even recently wrote, “To avoid conditioning on some observed covariates… is nonscientific ad hockery.” He is wrong; conditioning on B or C is a poor idea because it would open the noncausal path and therefore confound X and Y.
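
The path-blocking rules in the highlights above can be sketched in a few lines. The graph below is a hypothetical one (X ← A → B ← D → Y, with B → C and X → Y), chosen for illustration and not a reproduction of the book's figure; the function names are likewise assumptions of this sketch.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for a, b in edges:
            if a == cur and b not in found:
                found.add(b)
                stack.append(b)
    return found


def backdoor_paths(edges, x, y):
    """Simple paths from x to y whose first edge is an arrow pointing into x."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    paths = []

    def walk(path):
        last = path[-1]
        if last == y:
            paths.append(path)
            return
        for nxt in sorted(neighbors.get(last, ())):
            if nxt not in path:
                walk(path + [nxt])

    for parent in sorted(a for a, b in edges if b == x):
        walk([x, parent])
    return paths


def is_blocked(path, edges, z):
    """True if conditioning on the set z blocks this path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            # A collider blocks unless it, or one of its descendants, is in z.
            if node not in z and not (descendants(edges, node) & z):
                return True
        elif node in z:
            # A chain or fork node blocks exactly when it is conditioned on.
            return True
    return False


# Hypothetical graph: B is a collider on the only back-door path, C is its child.
edges = {("A", "X"), ("A", "B"), ("D", "B"), ("D", "Y"), ("B", "C"), ("X", "Y")}
paths = backdoor_paths(edges, "X", "Y")
for z in (set(), {"B"}, {"C"}, {"A", "B"}):
    print(sorted(z), [is_blocked(p, edges, z) for p in paths])
```

In this sketch the single back-door path is blocked when nothing is controlled for, opens when either the collider B or its descendant C is controlled for, and closes again once A is added to the adjustment set.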

34%

that there may be different strategies for deconfounding.

35%

Unfortunately, in the seat-belt example, A and C are variables relating to people’s attitudes and not likely to be observable. If you can’t observe it, you can’t adjust for it.

35%

Few moments in a scientific career are as satisfying as taking a problem that has puzzled and confused generations of predecessors and reducing it to a straightforward game or algorithm.

35%

I consider the complete solution of the confounding problem one of the main highlights of the Causal Revolution

35%

past. It has been a quiet revolution, raging primarily in research laboratories and scientific meetings. Yet, armed with these new tools and insights, the scientific community is now tackling harder problems, both th...

39%

Smoking may be harmful in that it contributes to low birth weight, but certain other causes of low birth weight, such as serious or life-threatening genetic abnormalities, are much more harmful.

39%

There are two possible explanations for low birth weight in one particular baby: it might have a smoking mother, or it might be affected by one of those other causes. If we find out that the mother is a smoker, this explains away the low weight and consequently reduces the likelihood of a serious birth defect.

39%

But if the mother does not smoke, we have stronger evidence that the cause of the low birth weight is a birth defect, and ...

39%

We can see that the birth-weight paradox is a perfect example of collider bias. The collider is Birth Weight.

39%

By looking only at babies with low birth weight, we are conditioning on that collider. This opens up a back-door path between Smoking and Mortality that goes Smoking → Birth Weight ← Birth Defect → Mortality.

39%

This path is noncausal because one of the arrows go...

39%

Nevertheless, it induces a spurious correlation between Smoking and Mortality and biases our estimate of the actual (direct) causal effect, Smoking → Mortality. In fact, it biases the estimate to such a ...

40%

He who confronts the paradoxical exposes himself to reality. —FRIEDRICH DÜRRENMATT (1962)

41%

But when we condition on a collider, we create a spurious dependence between its parents.
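
A small exact calculation makes this concrete. The setup is an assumption of this sketch, not an example from the highlights: two independent fair coins A and B, and a collider C that equals 1 whenever at least one coin shows 1.

```python
from fractions import Fraction
from itertools import product

# Four equally likely worlds (a, b, c); C is a common effect of A and B.
worlds = [(a, b, int(a or b)) for a, b in product((0, 1), repeat=2)]
weight = Fraction(1, 4)


def prob(event, given=lambda w: True):
    """Exact conditional probability P(event | given) over the four worlds."""
    den = sum(weight for w in worlds if given(w))
    num = sum(weight for w in worlds if given(w) and event(w))
    return num / den


# Without conditioning on C, learning B tells us nothing about A.
p_a = prob(lambda w: w[0] == 1)
p_a_given_b = prob(lambda w: w[0] == 1, lambda w: w[1] == 1)

# Among worlds with C = 1, learning B = 1 changes our belief about A.
p_a_given_c = prob(lambda w: w[0] == 1, lambda w: w[2] == 1)
p_a_given_b_c = prob(lambda w: w[0] == 1, lambda w: w[1] == 1 and w[2] == 1)

print(p_a, p_a_given_b, p_a_given_c, p_a_given_b_c)
```

Unconditionally, P(A=1) is 1/2 whether or not B is known. Once we restrict attention to C = 1, P(A=1) is 2/3, and it drops to 1/2 on learning B = 1: B "explains away" the collider, which is the same mechanism at work in the birth-weight example.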

42%

A correlation between X and Y cannot come about unless one of them causes the other or some third variable causes both.

45%

Simpson’s paradox: exercise appears to be beneficial (downward slope) in each age group but harmful (upward slope) in the population as a whole.
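
The reversal in that caption can be reproduced with a handful of invented data points. The numbers and group names below are hypothetical (x is weekly exercise, y is cholesterol, both in arbitrary units); the sketch fits an ordinary least-squares slope within each age group and then on the pooled data.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Invented data: older groups both exercise more and have higher cholesterol.
groups = {
    "young": [(1, 10), (2, 9), (3, 8)],
    "middle": [(3, 13), (4, 12), (5, 11)],
    "older": [(5, 16), (6, 15), (7, 14)],
}

within = {age: slope(pts) for age, pts in groups.items()}
pooled = slope([p for pts in groups.values() for p in pts])

print(within)   # slope inside each age group
print(pooled)   # slope when age is ignored
```

Every within-group slope is negative while the pooled slope is positive, because age drives both exercise and cholesterol in this toy data set.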

46%

The most familiar methods to estimate the effect of an intervention, in the presence of confounders, are the back-door adjustment and instrumental variables.

46%

The method of front-door adjustment was unknown before the introduction of causal diagrams. The do-calculus, which my students have fully automated, makes it possible to tailor the adjustment method to any particular causal diagram. (Source: Drawing by Dakota Harr.)

46%

He whose actions exceed his theory, his theory shall endure. —RABBI HANINA BEN DOSA (FIRST CENTURY AD)

46%

Confounding was the primary obstacle that caused us to confuse seeing with doing. Having removed this obstacle with the tools of “path blocking” and the back-door criterion, we can now map the routes up Mount Intervention with systematic precision.

46%

describes a “universal mapping tool” called the do-calculus, which allows the researcher to explore and plot all possible routes up Mount Intervention, no matter how twisty.

46%

Statisticians have devised ingenious methods for handling this “curse of dimensionality” problem. Most involve some sort of extrapolation, whereby a smooth function is fitted to the data and used to fill in the holes created by the empty strata. The most widely used smoothing function is of course a linear approximation, which served as the workhorse of most quantitative work in the social and behavioral sciences in the twentieth century.

47%

truth. Regression coefficients, whether adjusted or not, are only statistical trends, conveying no causal information in themselves.

47%

so crucial that Sewall Wright distinguished path coefficients (which represent causal effects) from regression coefficients (which represent trends of data points). Path coefficients are fundamentally different from regression coefficients, although they can often be computed from the latter.

47%

Wright failed to realize, however, as did all path analysts and econometricians after him, that his computations were unnecessarily complicated.

47%

Keep in mind also that the regression-based adjustment works only for linear models, which involve a major modeling assumption. With linear models, we lose the ability to model nonlinear interactions, such as when the effect of X on Y depends on the level of Z. The back-door adjustment, on the other hand, still works fine even when we have no idea what functions are behind the arrows in the diagrams. But in this so-called nonparametric case, we need to employ other extrapolation methods to deal with the curse of dimensionality.

47%

To sum up, the back-door adjustment formula and the back-door criterion are like the front and back of a coin. The back-door criterion tells us which sets of variables we can use to deconfound our data. The adjustment formula actually does the deconfounding.
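
As a sketch of what "does the deconfounding" means in practice, the function below applies the standard adjustment formula, P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) P(Z=z), to a table of counts. The counts, the single binary deconfounder Z, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical observational counts: (z, x) -> (units observed, units with Y=1).
counts = {
    (0, 0): (400, 40),
    (0, 1): (100, 30),
    (1, 0): (100, 50),
    (1, 1): (400, 280),
}


def naive(counts, x):
    """Unadjusted P(Y=1 | X=x), pooling over all strata of Z."""
    n = sum(c[0] for (_, xx), c in counts.items() if xx == x)
    y1 = sum(c[1] for (_, xx), c in counts.items() if xx == x)
    return y1 / n


def backdoor_adjust(counts, x):
    """Estimate P(Y=1 | do(X=x)) as sum over z of P(Y=1 | x, z) * P(z).

    Assumes Z satisfies the back-door criterion and that every (z, x)
    stratum is non-empty; an empty stratum would divide by zero.
    """
    total = sum(n for n, _ in counts.values())
    n_z = defaultdict(int)
    for (z, _), (n, _) in counts.items():
        n_z[z] += n
    estimate = 0.0
    for z, nz in n_z.items():
        n, y1 = counts[(z, x)]
        estimate += (y1 / n) * (nz / total)
    return estimate


print(round(naive(counts, 1) - naive(counts, 0), 2))
print(round(backdoor_adjust(counts, 1) - backdoor_adjust(counts, 0), 2))
```

The first printed contrast is the raw association; the second is the adjusted estimate. The need for every stratum to contain data is where the curse of dimensionality mentioned earlier starts to bite as Z grows.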

47%

Unfortunately, though, adjustment does not work at all if there is a back-door path we cannot block because we don’t have the requisite data. Yet we can still use certain tricks even in this situation. I will tell you about one of my favorite methods next, called the front-door adjustment. Even though it was published more than twenty years ago, only a handful of researchers have taken advantage of this shortcut up Mount Intervention, and I am convinced that its full potential remains untapped.

47%

Suppose we are doing an observational study and have collected data on Smoking, Tar, and Cancer for each of the participants. Unfortunately, we cannot collect data on the Smoking Gene because we do not know whether such a gene exists. Lacking data on the confounding variable, we cannot block the back-door path Smoking ← Smoking Gene → Cancer. Thus we cannot use back-door adjustment to control for the effect of the confounder. So we must look for another way. Instead of going in the back door, we can go in the front door! In this case, the front door is the direct causal path Smoking → Tar → Cancer, ...
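
The highlight is cut off before the formula, so the following is a sketch based on the standard front-door formula, P(y | do(x)) = Σ_m P(m | x) Σ_x' P(y | x', m) P(x'), rather than on text from this excerpt. The model is hypothetical: U stands in for the unobserved Smoking Gene, X for Smoking, M for Tar, Y for Cancer, and every probability is invented. The front-door estimate uses only the observed variables and is compared against the true interventional value computed with access to U.

```python
from itertools import product

# Hypothetical model: U -> X, U -> Y, X -> M, M -> Y. All numbers are invented.
P_U = {0: 0.5, 1: 0.5}
P_X1_U = {0: 0.2, 1: 0.8}                    # P(X=1 | U=u)
P_M1_X = {0: 0.1, 1: 0.9}                    # P(M=1 | X=x)
P_Y1_MU = {(0, 0): 0.1, (1, 0): 0.4,         # P(Y=1 | M=m, U=u), keyed by (m, u)
           (0, 1): 0.5, (1, 1): 0.8}


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1 - p1


def joint(u, x, m, y):
    return (P_U[u] * bern(P_X1_U[u], x) * bern(P_M1_X[x], m)
            * bern(P_Y1_MU[(m, u)], y))


def p_obs(**fixed):
    """Probability of an event over the observed x, m, y with U summed out."""
    total = 0.0
    for u, x, m, y in product((0, 1), repeat=4):
        vals = {"x": x, "m": m, "y": y}
        if all(vals[k] == v for k, v in fixed.items()):
            total += joint(u, x, m, y)
    return total


def front_door(x):
    """P(Y=1 | do(X=x)) estimated from observed quantities only."""
    result = 0.0
    for m in (0, 1):
        p_m_given_x = p_obs(x=x, m=m) / p_obs(x=x)
        inner = sum(p_obs(x=xp, m=m, y=1) / p_obs(x=xp, m=m) * p_obs(x=xp)
                    for xp in (0, 1))
        result += p_m_given_x * inner
    return result


def truth(x):
    """P(Y=1 | do(X=x)) computed with access to the hidden U, for comparison."""
    return sum(P_U[u] * bern(P_M1_X[x], m) * P_Y1_MU[(m, u)]
               for u, m in product((0, 1), repeat=2))


print(round(front_door(1), 2), round(truth(1), 2))
print(round(p_obs(x=1, y=1) / p_obs(x=1), 2))  # naive P(Y=1 | X=1)
```

In this toy model the front-door estimate matches the true interventional probability even though U never enters the calculation, while the naive conditional probability is noticeably higher.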
