Kindle Notes & Highlights
Read between May 11 - August 26, 2020
Language has its own poetry, they felt, and it’s tacky to talk explicitly about numerical odds. It makes you sound like a bookie. Kent wasn’t impressed. “I’d rather be a bookie than a goddamn poet,” was his legendary response.
Another benefit: vague thoughts are easily expressed with vague language, but when forecasters are forced to translate terms like “serious possibility” into numbers, they have to think carefully about how they are thinking, a process known as metacognition.
But a more fundamental obstacle to adopting numbers relates to accountability and what I call the wrong-side-of-maybe fallacy.
But people do judge. And they always judge the same way: they look at which side of “maybe”—50%—the probability was on. If the forecast said there was a 70% chance of rain and it rains, people think the forecast was right; if it doesn’t rain, they think it was wrong. This simple mistake is extremely common.
I made this mistake in the 2016 US presidential election. Just because the polls gave Trump a 10% chance of winning doesn’t mean they were wrong.
So what’s the safe thing to do? Stick with elastic language. Forecasters who use “a fair chance” and “a serious possibility” can even make the wrong-side-of-maybe fallacy work for them: If the event happens, “a fair chance” can retroactively be stretched to mean something considerably bigger than 50%—so the forecaster nailed it. If it doesn’t happen, it can be shrunk to something much smaller than 50%—and again the forecaster nailed it. With perverse incentives like these, it’s no wonder people prefer rubbery words over firm numbers.
When CIA analysts told President Obama they were “70%” or “90%” sure the mystery man in a Pakistani compound was Osama bin Laden, it was a small, posthumous triumph for Sherman Kent.
Almost anything “may” happen. I can confidently forecast that the Earth may be attacked by aliens tomorrow. And if it isn’t? I’m not wrong. Every “may” is accompanied by an asterisk and the words “or may not” are buried in the fine print. But the interviewer didn’t notice the fine print in Ferguson’s forecast, so he did not ask him to clarify.
We cannot rerun history so we cannot judge one probabilistic forecast—but everything changes when we have many probabilistic forecasts. If a meteorologist says there is a 70% chance of rain tomorrow, that forecast cannot be judged, but if she predicts the weather tomorrow, and the day after, and the day after that, for months, her forecasts can be tabulated and her track record determined. If her forecasting is perfect, rain happens 70% of the time when she says there is 70% chance of rain, 30% of the time when she says there is 30% chance of rain, and so on. This is called calibration. It can be plotted on a simple chart, with stated probabilities on one axis and the actual frequency of events on the other; a perfectly calibrated forecaster’s results fall along the diagonal line.
If the meteorologist’s curve is far above the line, she is underconfident—so things she says are 20% likely actually happen 50% of the time. If her curve is far under the line, she is overconfident—so things she says are 80% likely actually happen only 50% of the time.
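A minimal Python sketch of that tabulation, assuming a hypothetical track record stored as (stated probability, outcome) pairs; the function name and sample data are illustrative, not from the book:

```python
from collections import defaultdict

def calibration_table(records):
    """records: iterable of (stated_probability, it_rained) pairs.

    Groups forecasts by the probability stated, then compares each
    stated probability with the frequency at which rain actually fell.
    A well-calibrated forecaster's two columns match.
    """
    buckets = defaultdict(list)
    for prob, outcome in records:
        buckets[prob].append(outcome)
    for prob in sorted(buckets):
        outcomes = buckets[prob]
        observed = sum(outcomes) / len(outcomes)
        print(f"said {prob:.0%} -> happened {observed:.0%} "
              f"({len(outcomes)} forecasts)")

# A hypothetical record of months of 70% and 30% rain calls.
history = [(0.7, True)] * 7 + [(0.7, False)] * 3 \
        + [(0.3, True)] * 3 + [(0.3, False)] * 7
calibration_table(history)
```

On this toy record the forecaster is perfectly calibrated: rain fell on 7 of the 10 days she said 70%, and 3 of the 10 days she said 30%.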
The math behind this system was developed by Glenn W. Brier in 1950, hence results are called Brier scores.
So Brier scores are like golf scores: lower is better.
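For readers who want the arithmetic, here is a sketch using Brier’s original two-category formulation, in which a yes/no forecast scores between 0 (perfect) and 2 (confidently wrong every time); the sample numbers are hypothetical:

```python
def brier_score(forecasts, outcomes):
    """Original two-category Brier score, averaged over forecasts.

    forecasts: probabilities assigned to the event happening.
    outcomes:  1 if the event happened, else 0.
    Squared error is summed over both categories ("happens" and
    "doesn't happen"), so the score runs from 0 (perfect) to 2
    (always certain and always wrong); lower is better, like golf.
    """
    total = 0.0
    for p, o in zip(forecasts, outcomes):
        total += (p - o) ** 2 + ((1 - p) - (1 - o)) ** 2
    return total / len(forecasts)

# A hypothetical record: confident-and-right beats hedging beats
# confident-and-wrong.
print(brier_score([0.9, 0.8], [1, 1]))  # 0.05 -- confident, correct
print(brier_score([0.5, 0.5], [1, 0]))  # 0.5  -- coin-flip hedging
print(brier_score([0.9, 0.8], [0, 0]))  # 1.45 -- confident, wrong
```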
For example, after the 2012 presidential election, Nate Silver, Princeton’s Sam Wang, and other poll aggregators were hailed for correctly predicting all fifty state outcomes, but almost no one noted that a crude, across-the-board prediction of “no change”—if a state went Democratic or Republican in 2008, it will do the same in 2012—would have scored forty-eight out of fifty, which suggests that the many excited exclamations of “He called all fifty states!” we heard at the time were a tad overwrought.
If you didn’t know the punch line of EPJ before you read this book, you do now: the average expert was roughly as accurate as a dart-throwing chimpanzee.
Hence the old joke about statisticians sleeping with their feet in an oven and their head in a freezer because the average temperature is comfortable.
These experts gathered as much information from as many sources as they could. When thinking, they often shifted mental gears, sprinkling their speech with transition markers such as “however,” “but,” “although,” and “on the other hand.” They talked about possibilities and probabilities, not certainties. And while no one likes to say “I was wrong,” these experts more readily admitted it and changed their minds.
But he got his start as an economist in the Reagan administration and later worked with Art Laffer, the economist whose theories were the cornerstone of Ronald Reagan’s economic policies. Kudlow’s one Big Idea is supply-side economics. When President George W. Bush followed the supply-side prescription and cut taxes, Kudlow was sure a boom would follow.
He dubbed it “the Bush boom.” Reality fell short: growth and job creation were positive but somewhat disappointing relative to the long-term average and particularly in comparison to that of the Clinton era, which began with a substantial tax hike.
They’re green-tinted glasses—like the glasses that visitors to the Emerald City were required to wear in L. Frank Baum’s The Wonderful Wizard of Oz.
The EPJ data revealed an inverse correlation between fame and accuracy: the more famous an expert was, the less accurate he was.
British scientist Sir Francis Galton famously tallied the entries of hundreds of fairgoers in a contest to guess the weight of an ox.
Their average guess—their collective judgment—was 1,197 pounds, one pound short of the correct answer, 1,198 pounds. It was the earliest demonstration of a phenomenon popularized by—and now named for—James Surowiecki’s bestseller The Wisdom of Crowds.
Hundreds of people added valid information, creating a collective pool far greater than any one of them possessed.
Some suggested the correct answer was higher, some lower. So they canceled each other out. With valid information piling up and errors nullifying themselves, the net result was an astonishingly accurate estimate.
The wisdom of crowds works because valid information accumulates while errors scatter in both directions and cancel each other out.
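A toy simulation of that cancellation, assuming each guess is the true value plus independent, zero-mean noise; the noise level and crowd sizes are made up for illustration:

```python
import random

# Hypothetical re-run of Galton's ox contest: each guesser sees the
# true weight plus personal error that is as likely high as low.
TRUE_WEIGHT = 1198  # pounds, the figure reported in the text

def crowd_estimate(n_guessers, noise_sd=75.0):
    """Return the mean guess of a crowd with random individual errors."""
    guesses = [random.gauss(TRUE_WEIGHT, noise_sd) for _ in range(n_guessers)]
    return sum(guesses) / len(guesses)

random.seed(42)
for n in (1, 10, 100, 1000):
    estimate = crowd_estimate(n)
    print(f"{n:>5} guessers -> mean guess {estimate:7.1f} "
          f"(error {abs(estimate - TRUE_WEIGHT):5.1f})")
```

As the crowd grows, the high and low errors increasingly offset one another and the average converges on the true weight.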
And because I did not think about the problem from this different perspective, and factor it into my own judgment, I was wrong.
Depending on the species, there may be as many as thirty thousand of these lenses on a single eye, each one occupying a physical space slightly different from those of the adjacent lenses, giving it a unique perspective. Information from these thousands of unique perspectives flows into the dragonfly’s brain where it is synthesized into vision so superb that the dragonfly can see in almost every direction simultaneously, with the clarity and precision it needs to pick off flying insects at high speed.
Why did you assume that an opponent who raises the bet has a strong hand if you would not raise with the same strong hand? “And it’s not until I walk them through the exercise,” Duke says, “that people realize they failed to truly look at the table from the perspective of their opponent.”
National Intelligence Estimates are the consensus view of the Central Intelligence Agency, the National Security Agency, the Defense Intelligence Agency, and thirteen other agencies. Collectively, these agencies are known as the intelligence community, or IC.
And this fantastically elaborate, expensive, and experienced intelligence apparatus concluded in October 2002 that the key claims of the Bush administration about Iraqi WMDs were correct.
So even the American intelligence community, supposedly the best and brightest from the CIA and the NSA, was wrong about Iraq having WMDs.
So the question “Was the IC’s judgment reasonable?” is challenging. But it’s a snap to answer “Was the IC’s judgment correct?” As I noted in chapter 2, a situation like that tempts us with a bait and switch: replace the tough question with the easy one, answer it, and then sincerely believe that we have answered the tough question.
In this light, the invasion of Iraq may NOT have been a bad decision; it may have been reasonable given the evidence that was available.
Jervis did not let the intelligence community off the hook. “There were not only errors but correctable ones,” he wrote about the IC’s analysis. “Analysis could and should have been better.”
Jervis concluded that better analysis could have and should have been done regarding Iraq’s purported manufacture of WMDs.
So the IC still would have concluded that Saddam had WMDs, they just would have been much less confident in that conclusion. That may sound like a gentle criticism. In fact, it’s devastating, because a less-confident conclusion from the IC may have made a huge difference:
At a White House briefing on December 12, 2002, the CIA director, George Tenet, used the phrase “slam dunk.”
Postmortems even revealed that the IC had never seriously explored the idea that it could be wrong. “There were no ‘red teams’ to attack the prevailing views, no analyses from devil’s advocates, no papers that provided competing possibilities.”
the 1979 failure to foresee the Iranian revolution—the biggest geopolitical disaster of that era
DARPA’s work even contributed to the invention of the Internet.
Think how shocking it would be to the intelligence professionals who have spent their lives forecasting geopolitical events—to be beaten by a few hundred ordinary people and some simple algorithms. It actually happened.
Even the extremizing tweak is based on a pretty simple insight: When you combine the judgments of a large group of people to calculate the “wisdom of the crowd” you collect all the relevant information that is dispersed among all those people. But none of those people has access to all that information. One person knows only some of it, another knows some more, and so on. What would happen if every one of those people were given all the information? They would become more confident—raising their forecasts closer to 100% or zero. If you then calculated the “wisdom of the crowd” it too would be more extreme. Extremizing approximates that by pushing the aggregate forecast toward the nearest extreme.
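A sketch of one common extremizing transform, which raises the crowd’s aggregate odds to a power greater than 1; the exponent used here is an illustrative choice, not a value given in the text:

```python
def extremize(p, a=2.5):
    """Push an aggregate probability toward 0 or 1.

    Raises the odds p/(1-p) to the power a; any a > 1 moves the
    forecast toward the nearest extreme.  Assumes 0 < p < 1.
    The exponent 2.5 is a hypothetical choice for illustration.
    """
    odds = p / (1 - p)
    extremized_odds = odds ** a
    return extremized_odds / (1 + extremized_odds)

# If the crowd's average forecast is 70%, extremizing nudges it up,
# mimicking the confidence the group would have if everyone shared
# all the dispersed information.
print(round(extremize(0.70), 3))  # ~0.893
```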