Kindle Notes & Highlights
Our measurements are always contaminated by noise—radio static, nuisances like flocks of birds, harmless cysts that show up on the scan—and they, too, will vary from measurement to measurement, falling into their own bell curve.
When we are forced to guess whether an observation is a signal (reflecting something real) or noise (the messiness in our observations), we have to apply a cutoff. In the jargon of signal detection, it’s called the criterion or response bias, symbolized as β (beta). If an observation is above the criterion, we say “Yes,” acting as if it is a signal (whether or not it is, which we can’t know); if it is below, we say “No,” acting as if it is noise.
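A minimal sketch of that yes/no rule; the observation values and the criterion below are made-up numbers for illustration, not anything from the book:

```python
# Illustrative decision rule from signal detection: compare an observation
# to a criterion and act as if it is a signal ("Yes") or noise ("No").
def respond(observation: float, criterion: float) -> str:
    """Say "Yes" (treat it as a signal) if the observation clears the
    criterion, otherwise say "No" (treat it as noise)."""
    return "Yes" if observation >= criterion else "No"

print(respond(observation=1.4, criterion=1.0))  # Yes: acted on as a signal
print(respond(observation=0.6, criterion=1.0))  # No: acted on as noise
```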
In that case the catastrophic cost of a false alarm would call for being absolutely sure you are being attacked before responding, which means setting the response criterion very, very high. Also relevant are the base rates of the bombers and seagulls that trigger those blips (the Bayesian priors). If seagulls were common but bombers rare, it would call for a high criterion (not jumping the gun), and vice versa.
So where, exactly, should a rational decision maker—an “ideal observer,” in the lingo of the theory—place the criterion? The answer is: at the point that would maximize the observer’s expected utility.
An ideal observer would set her criterion higher (need better evidence before saying “Yes”) to the degree that noise is likelier than a signal (a low Bayesian prior). It’s common sense: if signals are rare, you should say “Yes” less often. She should also set a higher bar when the payoffs for hits are lower or for correct rejections are higher,
if you’re paying big fines for false alarms, you should be more chary of saying “Yes,” but if you’re getting windfalls for hits, you should be more keen.
even with a crude sense of which costs are monstrous and which bearable, can make the decisions more consistent and justifiable.
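One way to see how the priors and the payoffs jointly fix the best criterion is to sweep candidate cutoffs and keep the one with the highest expected utility. This is a sketch only; the distributions, base rate, and payoff numbers below are assumptions, not the book's:

```python
# Sketch of an "ideal observer" placing the criterion: try many cutoffs and
# pick the one that maximizes expected utility. All numbers are illustrative.
import numpy as np
from scipy.stats import norm

p_signal = 0.1                       # Bayesian prior: signals are rare
p_noise = 1 - p_signal
noise = norm(loc=0.0, scale=1.0)     # distribution of observations when nothing is there
signal = norm(loc=2.0, scale=1.0)    # distribution when a signal is present (d' = 2)

# Utilities of the four outcomes (assumed values)
u_hit, u_miss = 10.0, -50.0                    # "Yes"/"No" when a signal is present
u_false_alarm, u_correct_reject = -20.0, 1.0   # "Yes"/"No" when it is only noise

def expected_utility(criterion: float) -> float:
    p_hit = signal.sf(criterion)     # P(observation > criterion | signal)
    p_fa = noise.sf(criterion)       # P(observation > criterion | noise)
    return (p_signal * (p_hit * u_hit + (1 - p_hit) * u_miss)
            + p_noise * (p_fa * u_false_alarm + (1 - p_fa) * u_correct_reject))

criteria = np.linspace(-3, 5, 801)
best = criteria[np.argmax([expected_utility(c) for c in criteria])]
print(f"utility-maximizing criterion ≈ {best:.2f}")
# Making signals rarer, false alarms costlier, or correct rejections more
# valuable pushes the best criterion up; the reverse pushes it down.
```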
how far apart the signal and noise distributions are, called the “sensitivity,” symbolized as dʹ, pronounced “d-prime.”
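In practice dʹ is often estimated from the hit and false-alarm rates by converting each to a z-score and taking the difference; the rates below are made up for illustration:

```python
# Standard signal-detection estimate of sensitivity d' from observed rates.
from scipy.stats import norm

hit_rate = 0.84          # P("Yes" | signal), illustrative
false_alarm_rate = 0.16  # P("Yes" | noise), illustrative
d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)
print(f"d' ≈ {d_prime:.2f}")   # ≈ 1.99: the two distributions sit about 2 SDs apart
```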
Enhancing sensitivity should always be our aspiration in signal detection challenges,
The most notorious is eyewitness testimony: research by Elizabeth Loftus and other cognitive psychologists has shown that people routinely and confidently recall seeing things that never happened.
some percentage of DNA testimony is corrupted by contaminated samples, botched labels, and other human error.
As the jurist William Blackstone (1723–1780) put it in his eponymous rule, “It is better that ten guilty persons escape than that one innocent suffer.” And so juries in criminal trials make a “presumption of innocence,” and may convict only if the defendant is “guilty beyond a reasonable doubt” (a high setting for β, the criterion or response bias). They may not convict based on a mere “preponderance of the evidence,” also known as “fifty percent plus a feather.”
Punishing the innocent, particularly by death, shocks the conscience in a way that failing to punish the guilty does not.
How strong would the evidence have to be to meet those targets? To be precise, how large does dʹ have to be, namely the distance between the distributions for the signal (guilty) and the noise (innocent)? The distance may be measured in standard deviations, the most common estimate of variability. (Visually it corresponds to the width of the bell curve, that is, the horizontal distance from the mean to the inflection point, where convex shifts to concave.)
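A hedged sketch of how that question can be posed numerically: assume a value of dʹ, a conviction criterion, and a base rate of guilt among defendants, then compare rightful with wrongful convictions. None of these numbers are the book's; they only show the form of the calculation:

```python
# Blackstone's question in signal-detection terms, with made-up parameters.
from scipy.stats import norm

d_prime = 3.0        # assumed separation (in SDs) between innocent and guilty evidence
criterion = 2.0      # assumed "beyond a reasonable doubt" cutoff, in SDs above the innocent mean
p_guilty = 0.5       # assumed base rate of guilt among defendants

p_convict_innocent = norm.sf(criterion)            # false-alarm rate
p_convict_guilty = norm.sf(criterion - d_prime)    # hit rate

wrongful = (1 - p_guilty) * p_convict_innocent
rightful = p_guilty * p_convict_guilty
print(f"guilty convicted per innocent convicted ≈ {rightful / wrongful:.0f} : 1")
```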
But of mathematical necessity, lowering the response criterion can only trade one kind of injustice for another.
But being mindful of the tragic tradeoffs in distinguishing signals from noise can bring greater justice. It forces us to face the enormity of harsh punishments like the death penalty and long sentences, which are not only cruel to the guilty but inevitably will be visited upon the innocent. And it tells us that the real quest for justice should consist of increasing the sensitivity of the system, not its bias:
Every statistics student is warned that “statistical significance” is a technical concept that should not be confused with “significance” in the vernacular sense of noteworthy or consequential.
like the difference in symptoms between the group that got the drug and the group that got the placebo, or the difference in verbal skills between boys and girls, or the improvement in test scores after students enrolled in an enrichment program. If the number is zero, it means there’s no effect; greater than zero, a possible eureka.
the distribution of scores that the scientist would obtain if there’s no difference in reality, called the null hypothesis, and the distribution of scores she would obtain if something is happening, an effect of a given size. The distributions overlap—that’s what makes science hard.
The null hypothesis is the noise; the alternative hypothesis is the signal. The size of the effect is like the sensitivity, and it determines how easy it is to tell signal from noise. The scientist needs to apply some criterion or response bias before breaking out the champagne, called the critical value: below the critical value, she fails to reject the null hypothesis and drowns her sorrows; above the critical value, she rejects it and celebrates—she declares the effect to be “statistically significant.”
She could reject the null hypothesis when it is true, namely a false alarm, or in the argot of statistical decision theory, a Type I error. Or she could fail to reject the null hypothesis when it is false—a miss, or in the patois, a Type II error. Both are bad:
Now, deep in the mists of time it was decided—it’s not completely clear by whom—that a Type I error (proclaiming an effect when there is none) is especially damaging to the scientific enterprise, which can tolerate only a certain number of them: 5 percent of the studies in which the null hypothesis is true, to be exact. And so the convention arose that scientists should adopt a critical level that ensures that the probability of rejecting the null hypothesis when it is true is less than 5 percent: the coveted “p < .05.”
That’s what “statistical significance” means: it’s a way to keep the proportion of false claims of discoveries beneath an arbitrary cap.
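A quick simulation makes the cap concrete: when the null hypothesis is true by construction, roughly 5 percent of experiments still come out "significant" at p < .05. The group sizes and the number of simulated studies below are arbitrary choices:

```python
# Simulate many experiments in which there is no real effect and count how
# often a t-test nonetheless crosses the p < .05 threshold (Type I errors).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_studies, n_per_group = 10_000, 30
false_alarms = 0
for _ in range(n_studies):
    drug = rng.normal(0, 1, n_per_group)      # both groups drawn from the
    placebo = rng.normal(0, 1, n_per_group)   # same distribution: no effect
    if ttest_ind(drug, placebo).pvalue < 0.05:
        false_alarms += 1
print(f"Type I error rate ≈ {false_alarms / n_studies:.3f}")   # ≈ 0.05
```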
The scientist cannot use a significance test to assess whether the null hypothesis is true or false unless she also considers the prior—her best guess of the probability that the null hypothesis is true before doing the experiment. And in the mathematics of null hypothesis significance testing, a Bayesian prior is nowhere to be found.
Bayesian reasoning can adjust our credence in the truth, but it must begin with a prior, with all the subjective judgment that goes into it.
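The standard way to combine the two is Bayes' rule: the probability that a significant result reflects a real effect depends on the prior plausibility of the effect and the test's power, not just on clearing p < .05. The power and the priors below are assumed for illustration:

```python
# Posterior probability that a "significant" result is a real effect,
# for several assumed priors, using Bayes' rule.
alpha = 0.05      # false-alarm rate when the null is true
power = 0.80      # hit rate when the effect is real (assumed)

for prior in (0.5, 0.1, 0.01):   # prior probability that the effect is real
    posterior = power * prior / (power * prior + alpha * (1 - prior))
    print(f"prior = {prior:>4}:  P(real effect | significant) ≈ {posterior:.2f}")
# prior 0.5 -> ~0.94, prior 0.1 -> ~0.64, prior 0.01 -> ~0.14
```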
game theory, the analysis of how to make rational choices when the payoffs depend on someone else’s rational choices.
game theory deals with dilemmas that pit us against equally cunning deciders, and the outcomes can turn our intuitions upside down and sideways.
Game theory unveils the strange rationality beneath many of the perversities of social and political life, and as we will see in a later chapter, it helps explain the central mystery of this book: how a rational species can be so irrational.
The crucial technique in game theory (and indeed in life) is to see the world from the other player’s point of view.
Amanda’s best strategy is to turn herself into a human roulette wheel and play each move at random with the same probability, stifling any skew, tilt, lean, or drift away from a perfect ⅓–⅓–⅓ split.
Each is playing the best strategy given the opponent’s best strategy; any unilateral change would make them worse off.
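A small check of that claim for rock-paper-scissors: against an opponent mixing ⅓–⅓–⅓, every move earns the same expected payoff, so no unilateral deviation helps. The +1/0/−1 payoffs are just the usual win/tie/loss convention:

```python
# Verify that every pure strategy does equally well against a uniform mixer,
# which is why the 1/3-1/3-1/3 mix is a Nash equilibrium.
import numpy as np

# Row player's payoffs; rows and columns are rock, paper, scissors.
payoff = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]])
opponent = np.array([1/3, 1/3, 1/3])
print(payoff @ opponent)   # [0. 0. 0.]: rock, paper, and scissors all average 0
```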
Even when a move is not literally picked at random (presumably in 1944 the Allies did not roll a die before deciding whether to invade Normandy or Calais), the player must assume a poker face and suppress any tell or leak, making the choice appear random to their opponents.
Here again they are in a Nash equilibrium, a standoff in which all the players stick with their best choice in response to the others’ best choices.
What they need is common knowledge, which in game theory is a technical term referring to something that each one knows that the other knows that they know, ad infinitum.

One of the most commonly cited human irrationalities is the sunk-cost fallacy, in which people continue to invest in a losing venture because of what they have invested so far rather than in anticipation of what they will gain going forward. Holding on to a tanking stock, sitting through a boring movie, finishing a tedious novel, and staying in a bad marriage are familiar examples.
The common rationale is “We fight so that our boys will not have died in vain,” a textbook example of the sunk-cost fallacy but also a tactic in the pathetic quest for a Pyrrhic victory.
Though persisting with a certain probability may be the least bad option once one is trapped in an Escalation Game, the truly rational strategy is not to play in the first place.
The Prisoner’s Dilemma has no solution, but the rules of the game can be changed. One way is for the players to enter enforceable agreements before playing, or to submit to the rule of an authority, which changes the payoffs by adding a reward for cooperation or a penalty for defection.
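A toy illustration of how changing the payoffs dissolves the dilemma: with the usual Prisoner's Dilemma numbers (assumed here, not taken from the book), defecting is the best reply to anything the partner does; add an enforced penalty for defection and cooperating becomes the best reply instead:

```python
# Prisoner's Dilemma payoffs (higher is better), before and after an
# enforceable agreement imposes a penalty for defecting. Numbers are assumed.
base = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}
penalty = 3   # enforced fine for defecting

def best_reply(game, opponent_move):
    """The row player's best response to a fixed move by the opponent."""
    return max(("cooperate", "defect"), key=lambda m: game[(m, opponent_move)][0])

with_contract = {moves: (p1 - penalty * (moves[0] == "defect"),
                         p2 - penalty * (moves[1] == "defect"))
                 for moves, (p1, p2) in base.items()}

for game, label in ((base, "no contract"), (with_contract, "with contract")):
    print(label, {opp: best_reply(game, opp) for opp in ("cooperate", "defect")})
# no contract: defect is the best reply to everything (the dilemma)
# with contract: cooperate becomes the best reply to everything
```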
Everyone in a community benefits from a public good such as a lighthouse, roads, sewers, police, and schools. But they benefit even more if everyone else pays for them and they are free riders—once
In a poignant environmental version called the Tragedy of the Commons, every shepherd has an incentive to add one more sheep to his flock and graze it on the town commons, but when everyone fattens their flock, the grass is grazed faster than it can regrow, and all the sheep starve. Traffic and pollution work the same way:
Just as an enforceable oath can spare the prisoners in a two-person Dilemma from mutual defection, enforceable laws and contracts can punish people for their own mutual good in a Public Goods game.
A commons in a community where everyone knows everyone else can be protected by a multiplayer version of Tit for Tat: any exploiter of a resource becomes a target of gossip, shaming, veiled threats, and discreet vandalism. In larger and more anonymous communities, changes to the payoffs must be made by enforceable contracts and regulations. And so we pay taxes for roads, schools, and a court system, with evaders sent to jail. Ranchers buy grazing permits, and fishers respect limits on their catch, as long as they know they’re being enforced on the other guy, too.
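For reference, the two-player Tit for Tat strategy that the multiplayer version generalizes is only a few lines: cooperate on the first round, then mirror whatever the other player did last. This is a generic sketch, not code from the book:

```python
# Bare-bones Tit for Tat: open with cooperation, then copy the opponent's
# previous move (retaliate after a defection, forgive after cooperation).
def tit_for_tat(opponent_history: list[str]) -> str:
    if not opponent_history:          # first round: extend trust
        return "cooperate"
    return opponent_history[-1]       # afterwards: mirror the last move

print(tit_for_tat([]))                          # cooperate
print(tit_for_tat(["cooperate", "defect"]))     # defect (retaliate)
print(tit_for_tat(["defect", "cooperate"]))     # cooperate (forgive)
```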
The logic of Prisoner’s Dilemmas and Public Goods undermines anarchism and radical libertarianism, despite the eternal appeal of unfettered freedom.
One of the first things taught in introductory statistics textbooks is that correlation is not causation. It is also one of the first things forgotten. —Thomas Sowell
The concept of causation, and its contrast with mere correlation, is the lifeblood of science.
This is the mathematical technique called regression, the workhorse of epidemiology and social science.
The rod settles into a location and an angle that minimize the sum of the squared distances between each tack and the point where it’s attached. The rod, thus positioned, is called a regression line, and it captures the linear relationship between the two variables: y, corresponding to the vertical axis, and x, corresponding to the horizontal one. The length of the rubber band connecting each tack to the line is called the residual,
If income predicted happiness perfectly, every dot would fall exactly along the gray regression line, but with real data, that never happens. Some of the dots float above the line
r, the correlation coefficient, which ranges from –1, when the dots lie perfectly along a diagonal running from northwest to southeast; through 0, when they are an uncorrelated swarm of gnats; through positive values where they splatter southwest to northeast; to 1, where they lie perfectly along the diagonal.
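A minimal sketch of those quantities on made-up income and happiness data: the least-squares line, the residuals, and r:

```python
# Fit a least-squares regression line and compute the correlation coefficient
# for a small, invented dataset.
import numpy as np

income = np.array([20, 35, 50, 65, 80, 95], dtype=float)   # x, hypothetical
happiness = np.array([4.1, 5.0, 5.2, 6.3, 6.1, 7.2])       # y, hypothetical

slope, intercept = np.polyfit(income, happiness, deg=1)  # minimizes the sum of squared residuals
residuals = happiness - (slope * income + intercept)     # vertical gaps from each dot to the line
r = np.corrcoef(income, happiness)[0, 1]                 # ranges from -1 through 0 to +1

print(f"line: y = {slope:.3f} x + {intercept:.2f}, r = {r:.2f}")
```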

