Kindle Notes & Highlights
by Judea Pearl
Read between January 25 - February 13, 2022
humans began to realize that certain things cause other things and that tinkering with the former can change the latter.
The calculus of causation consists of two languages: causal diagrams, to express what we know, and a symbolic language, resembling algebra, to express what we want to know.
One of the crowning achievements of the Causal Revolution has been to explain how to predict the effects of an intervention without actually enacting it. It would never have been possible if we had not, first of all, defined the do-operator so that we can ask the right question and, second, devised a way to emulate it by noninvasive means.
Counterfactual
First, in the world of AI, you do not really understand a topic until you can teach it to a mechanical robot.
regression to the mean
Thus, Galton conjectured, regression toward the mean was a physical process, nature’s way of ensuring that the distribution of height (or intelligence) remained the same from generation to generation.
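A minimal simulation sketch of regression toward the mean, added here as an illustration and not taken from the book. It assumes standardized heights (mean 0, spread 1) and a hypothetical parent-child correlation `r = 0.5`; both the model and the number are assumptions chosen only to make the pattern visible.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 50_000
r = 0.5                  # hypothetical parent-child correlation (assumption)

# Standardized "heights": each child is partly inherited, partly independent noise.
parents = [rng.gauss(0, 1) for _ in range(n)]
children = [r * p + (1 - r ** 2) ** 0.5 * rng.gauss(0, 1) for p in parents]

# Look only at families whose parent is unusually tall.
tall = [(p, c) for p, c in zip(parents, children) if p > 1.0]
mean_tall_parent = statistics.fmean(p for p, _ in tall)
mean_their_child = statistics.fmean(c for _, c in tall)

print(f"tall parents average:   {mean_tall_parent:.2f}")
print(f"their children average: {mean_their_child:.2f}")
print(f"spread of parents:      {statistics.pstdev(parents):.2f}")
print(f"spread of children:     {statistics.pstdev(children):.2f}")
```

In this sketch the children of the tallest parents average closer to the population mean than their parents do, while the overall spread of the two generations stays about equal. No physical restoring force is modeled anywhere; imperfect correlation alone produces the pattern.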
Wright was an early advocate of the view that evolution is not gradual, as Darwin had posited, but takes place in relatively sudden bursts.
“To contrast ‘causation’ and ‘correlation’ is unwarranted because causation is simply perfect correlation.”
Wright makes one point with absolute clarity: you cannot draw causal conclusions without some causal hypotheses.
you cannot answer a question on rung two of the Ladder of Causation using only data collected from rung one.
Sometimes people ask me, “Doesn’t that make causal reasoning circular? Aren’t you just assuming what you want to prove?” The answer is no. By combining very mild, qualitative, and obvious assumptions (e.g., coat color of the son does not influence that of the parents) with his twenty years of guinea pig data, he obtained a quantitative and by no means obvious result: that 42 percent of the variation in coat color is due to heredity. Extracting the nonobvious from the obvious is not circular—it is a scientific triumph and deserves to be hailed as such.
“The writer [i.e., Wright himself] has never made the preposterous claim that the theory of path coefficients provides a general formula for the deduction of causal relations. He wishes to submit that the combination of knowledge of correlations with knowledge of causal relations to obtain certain results, is a different thing from the deduction of causal relations from correlations implied by Niles’ statement.”
Where did Wright get this inner conviction that he was on the right track and the rest of the kindergarten class was just plain wrong? Maybe his Midwestern upbringing and the tiny college he went to encouraged his self-reliance and taught him that the surest kind of knowledge is what you construct yourself.
path analysis,
Sherlock Holmes meets his modern counterpart, a robot equipped with a Bayesian network. In different ways both are tackling the question of how to infer causes from observations.
“When you have eliminated the impossible, whatever remains, however improbable, must be the truth.”
induction and deduction
Bayes, this assertion provoked a natural, one might say Holmesian question: How much evidence would it take to convince us that something we consider improbable has actually happened?
paper is remembered and argued about 250 years later, not for its theology but because it shows that you can deduce the probability of a cause from an effect.
If we know the cause, it is easy to estimate the probability of the effect, which is a forward probability.
Here I must confess that in the teahouse example, by deriving Bayes’s rule from data, I have glossed over two profound objections, one philosophical and the other practical. The philosophical one stems from the interpretation of probabilities as a degree of belief, which we used implicitly in the teahouse example. Who ever said that beliefs act, or should act, like proportions in the data?
“subjectivity,”
tiny number of true positives (i.e., women with breast cancer) is overwhelmed by the number of false positives. Our sense of surprise at this result comes from the common cognitive confusion between the forward probability, which is well studied and thoroughly documented, and the inverse probability, which is needed for personal decision making.
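A small worked sketch of the forward versus inverse probability distinction, using Bayes's rule. The prevalence, sensitivity, and false-positive rate below are hypothetical round numbers picked for illustration; they are not the figures used in the book.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Inverse probability P(disease | positive test) via Bayes's rule."""
    true_pos = prior * sensitivity                   # P(disease and positive)
    false_pos = (1 - prior) * false_positive_rate    # P(no disease and positive)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs (assumptions, not data from the book).
prior = 0.001               # P(disease) before testing
sensitivity = 0.9           # forward probability P(positive | disease)
false_positive_rate = 0.1   # P(positive | no disease)

p = posterior(prior, sensitivity, false_positive_rate)
print(f"forward  P(positive | disease) = {sensitivity:.3f}")
print(f"inverse  P(disease | positive) = {p:.4f}")
```

With these assumed inputs the forward probability is 90 percent, yet the inverse probability comes out just under 1 percent, because the few true positives are swamped by false positives from the much larger healthy group.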
There was no shortage of ideas. Lotfi Zadeh of Berkeley offered “fuzzy logic,” in which statements are neither true nor false but instead take a range of possible truth values.
RCT
CONFOUNDING
case-control study because it compares “cases” (people with a disease) to controls.
consistency (many studies, in different populations, show similar results); strength of association (including the dose-response effect: more smoking is associated with a higher risk); specificity of the association (a particular agent should have a particular effect and not a long litany of effects); temporal relationship (the effect should follow the cause); and coherence (biological plausibility and consistency with other types of evidence such as laboratory experiments and time series).
whole list of nine criteria have become known as “Hill’s criteria.”
collider bias
The first one uses causal reasoning to explain why we observe a spurious dependence between Your Door and Location of Car; the second uses Bayesian reasoning to explain why the probability of Door 2 goes up in Let’s Make a Deal.
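A simulation sketch of the Let's Make a Deal setup, offered as one way to check both explanations numerically. It assumes the standard rules, which are not spelled out in this highlight: the host knows where the car is, always opens a door that hides no car and is not the contestant's pick, and chooses at random when two such doors exist.

```python
import random

def prob_car_behind_door_2(trials=100_000, seed=0):
    """Estimate P(car = Door 2 | you picked Door 1, host opened Door 3)."""
    rng = random.Random(seed)
    doors = [1, 2, 3]
    kept = 0           # games matching the conditioning event
    car_behind_2 = 0   # of those, games where the car is behind Door 2
    for _ in range(trials):
        car = rng.choice(doors)    # Location of Car
        pick = rng.choice(doors)   # Your Door, chosen independently of the car
        # Door Opened depends on both Your Door and Location of Car (a collider).
        opened = rng.choice([d for d in doors if d != pick and d != car])
        if pick == 1 and opened == 3:
            kept += 1
            car_behind_2 += (car == 2)
    return car_behind_2 / kept

p_door2 = prob_car_behind_door_2()
print(f"P(car behind Door 2 | picked 1, host opened 3) ~ {p_door2:.3f}")
```

Your Door and Location of Car are generated independently here, so any dependence that appears comes only from filtering on the door the host opened. Under the assumed rules the estimate settles near 2/3 rather than 1/2.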
Hans Reichenbach made a daring conjecture called the “common cause principle.” Rebutting the adage “Correlation does not imply causation,” Reichenbach posited a much stronger idea: “No correlation without causation.” He meant that a correlation between two variables, X and Y, cannot come about by accident. Either one of the variables causes the other, or a third variable, say Z, precedes and causes them both.
Any claim to resolve a paradox (especially one that is decades old) should meet some basic criteria. First, as I said above in connection with the Monty Hall paradox, it should explain why people find the paradox surprising or unbelievable. Second, it should identify the class of scenarios in which the paradox can occur. Third, it should inform us of scenarios, if any, in which the paradox cannot occur. Finally, when the paradox does occur, and we have to make a choice between two plausible yet contradictory statements, it should tell us which statement is correct.
causal assumption.
In the next chapter we begin our ascent up the Ladder of Causation, beginning with rung two: intervention.
Confounding was the primary obstacle that caused us to confuse seeing with doing. Having removed this obstacle with the tools of “path blocking” and the back-door criterion, we can now map the routes up Mount Intervention with systematic precision. For the novice climber, the safest routes up the mountain are the back-door adjustment and its various cousins, some going under the rubric of “front-door adjustment” and some under “instrumental variables.”
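A minimal sketch of back-door adjustment on a toy model, contrasting seeing with doing. Everything in it is an assumption made for illustration: a single binary confounder Z that satisfies the back-door criterion for the effect of X on Y, and made-up probability tables named `p_z`, `p_x1_given_z`, and `p_y1_given_xz`.

```python
# Hypothetical model: Z -> X, Z -> Y, X -> Y, all variables binary.
p_z = {0: 0.5, 1: 0.5}                    # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}           # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3,
                 (0, 1): 0.5, (1, 1): 0.7}  # P(Y = 1 | X = x, Z = z)

def p_x_given_z(x, z):
    """P(X = x | Z = z) for binary X."""
    return p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]

def observed(x):
    """Seeing: P(Y = 1 | X = x), which mixes in the influence of Z."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    joint = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    return joint / p_x

def adjusted(x):
    """Doing: P(Y = 1 | do(X = x)) = sum over z of P(Y = 1 | x, z) * P(z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)

naive_diff = observed(1) - observed(0)
causal_diff = adjusted(1) - adjusted(0)
print(f"naive difference (seeing):   {naive_diff:.2f}")
print(f"adjusted difference (doing): {causal_diff:.2f}")
```

In this toy model the naive comparison overstates the effect (0.44 versus an adjusted 0.20), because Z raises both the chance of X and the chance of Y. The adjustment is only valid if Z really does block every back-door path, which is a causal assumption and cannot be checked from these tables alone.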
“universal mapping tool” called the do-calculus,
do-calculus.
calculus.
confounder
Counterfactuals:
Our gift, which may sometimes be a curse, is that we can see what might have been.
Upon finding no headache in that world, we declare the counterfactual statement to be true. “Most similar” is key.
proximate cause
All the above questions require a sensitive ability to tease apart total effects, direct effects (which do not pass through a mediator), and indirect effects (which do).
Bear in mind that this was the heyday of the eugenics movement,
bias and discrimination. Bias is a slippery statistical notion, which may disappear if you slice the data a different way. Discrimination, as a causal concept, reflects reality and must remain stable.
causal mediation at the Uncertainty in Artificial Intelligence conference in Seattle.
back-door criterion