Superforecasting: The Art and Science of Prediction
So if the National Intelligence Estimate said something is “probable,” it would mean a 63% to 87% chance it would happen. Kent’s scheme was simple—and it greatly reduced the room for confusion.
Ed Carmichael
Brilliant: forcing consensus on what certain terms mean
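Kent's mapping is easy to make concrete. A minimal sketch in Python: only the "probable" band (63% to 87%) comes from the excerpt above; the other bands, the dictionary name, and the helper are illustrative placeholders, not Kent's actual table.

```python
# A sketch of Kent's idea: translate estimative words into numeric bands.
# Only the "probable" band (63% to 87%) comes from the excerpt above; the
# other entries are illustrative placeholders, not Kent's actual table.
WORD_TO_ODDS = {
    "almost certain":       (0.87, 0.99),  # placeholder
    "probable":             (0.63, 0.87),  # from the excerpt
    "chances about even":   (0.40, 0.60),  # placeholder
    "probably not":         (0.13, 0.37),  # placeholder
    "almost certainly not": (0.01, 0.13),  # placeholder
}

def interpret(term: str) -> str:
    low, high = WORD_TO_ODDS[term.lower()]
    return f"'{term}' means a {low:.0%} to {high:.0%} chance"

print(interpret("probable"))  # 'probable' means a 63% to 87% chance
```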
Language has its own poetry, they felt, and it’s tacky to talk explicitly about numerical odds. It makes you sound like a bookie. Kent wasn’t impressed. “I’d rather be a bookie than a goddamn poet,” was his legendary response.
another benefit: vague thoughts are easily expressed with vague language but when forecasters are forced to translate terms like “serious possibility” into numbers, they have to think carefully about how they are thinking, a process known as metacognition.
But a more fundamental obstacle to adopting numbers relates to accountability and what I call the wrong-side-of-maybe fallacy.
But people do judge. And they always judge the same way: they look at which side of “maybe”—50%—the probability was on. If the forecast said there was a 70% chance of rain and it rains, people think the forecast was right; if it doesn’t rain, they think it was wrong. This simple mistake is extremely common.
Ed Carmichael
I made this mistake in the 2016 US presidential election. Just because the forecasts gave Trump a 10% chance of winning doesn’t mean they were wrong
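The fallacy is easy to demonstrate in a few lines. A minimal simulation, assuming a perfectly calibrated forecaster whose "70%" events really do happen 70% of the time; the wrong-side-of-maybe scorer still brands her wrong on roughly 30% of trials.

```python
import random

random.seed(42)

# A perfectly calibrated forecaster says "70% chance" and the event
# genuinely happens 70% of the time.
trials = 10_000
misses = sum(random.random() >= 0.70 for _ in range(trials))

# The wrong-side-of-maybe scorer calls every 70% forecast "wrong" whenever
# the event fails to happen, so a flawless forecaster looks wrong ~30% of
# the time.
print(f"branded 'wrong' on {misses / trials:.0%} of trials")  # ~30%
```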
So what’s the safe thing to do? Stick with elastic language. Forecasters who use “a fair chance” and “a serious possibility” can even make the wrong-side-of-maybe fallacy work for them: If the event happens, “a fair chance” can retroactively be stretched to mean something considerably bigger than 50%—so the forecaster nailed it. If it doesn’t happen, it can be shrunk to something much smaller than 50%—and again the forecaster nailed it. With perverse incentives like these, it’s no wonder people prefer rubbery words over firm numbers.
When CIA analysts told President Obama they were “70%” or “90%” sure the mystery man in a Pakistani compound was Osama bin Laden, it was a small, posthumous triumph for Sherman Kent.
vacuous
Ed Carmichael
Mindless
Almost anything “may” happen. I can confidently forecast that the Earth may be attacked by aliens tomorrow. And if it isn’t? I’m not wrong. Every “may” is accompanied by an asterisk and the words “or may not” are buried in the fine print. But the interviewer didn’t notice the fine print in Ferguson’s forecast, so he did not ask him to clarify.
Ed Carmichael
Make it a practice to ask people to quantify their predictions
We cannot rerun history so we cannot judge one probabilistic forecast—but everything changes when we have many probabilistic forecasts. If a meteorologist says there is a 70% chance of rain tomorrow, that forecast cannot be judged, but if she predicts the weather tomorrow, and the day after, and the day after that, for months, her forecasts can be tabulated and her track record determined. If her forecasting is perfect, rain happens 70% of the time when she says there is 70% chance of rain, 30% of the time when she says there is 30% chance of rain, and so on. This is called calibration. It can…
Ed Carmichael
The right way to judge predictions/forecasts
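Here's a sketch of how that tabulation might look in code; the data format (stated probability, did it happen) and the 10%-wide buckets are my own assumptions, not anything from the book.

```python
from collections import defaultdict

def calibration_table(forecasts):
    """forecasts: list of (stated_probability, happened) pairs, e.g.
    (0.7, True) for a "70% chance of rain" day that turned out rainy."""
    buckets = defaultdict(list)
    for p, happened in forecasts:
        buckets[round(p, 1)].append(happened)  # group by stated probability
    for p in sorted(buckets):
        hits = buckets[p]
        freq = sum(hits) / len(hits)
        print(f"said {p:.0%}: happened {freq:.0%} of {len(hits)} times")

# Perfect calibration means the two percentages match on every row:
# rain on ~70% of the "70%" days, ~30% of the "30%" days, and so on.
```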
If the meteorologist’s curve is far above the line, she is underconfident—so things she says are 20% likely actually happen 50% of the time (see this page). If her curve is far under the line, she is overconfident—so things she says are 80% likely actually happen only 50% of the time (see this page).
Two ways to be miscalibrated: underconfident (over the line) and overconfident (under the line)
Ed Carmichael
It’s not better to be underconfident; it’s still just as wrong as overconfidence
The math behind this system was developed by Glenn W. Brier in 1950, hence results are called Brier scores.
So Brier scores are like golf scores: lower is better.
A forecast that is wrong to the greatest possible extent—saying there is a 100% chance that something will happen and it doesn’t, every time—scores a disastrous 2.0, as far from The Truth as it is possible to get.
Ed Carmichael
Brier scores for forecasting range from 0 (perfect) to 2 (dead wrong)
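A sketch of Brier's two-category scoring rule, which is the form behind the book's 0-to-2 range; the function name and data layout are my own.

```python
def brier_score(forecasts):
    """Brier's original two-category form: for each forecast, sum squared
    error over both outcomes ("happens" and "doesn't"), then average.
    0.0 is perfect; 2.0 is maximally wrong."""
    total = 0.0
    for p, happened in forecasts:       # p = stated chance it happens
        o = 1.0 if happened else 0.0
        total += (p - o) ** 2 + ((1 - p) - (1 - o)) ** 2
    return total / len(forecasts)

print(brier_score([(1.0, False)]))  # 2.0: "certain," yet it didn't happen
print(brier_score([(1.0, True)]))   # 0.0: "certain," and it happened
print(brier_score([(0.5, True)]))   # 0.5: hedging at "maybe" every time
```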
For example, after the 2012 presidential election, Nate Silver, Princeton’s Sam Wang, and other poll aggregators were hailed for correctly predicting all fifty state outcomes, but almost no one noted that a crude, across-the-board prediction of “no change”—if a state went Democratic or Republican in 2008, it will do the same in 2012—would have scored forty-eight out of fifty, which suggests that the many excited exclamations of “He called all fifty states!” we heard at the time were a tad overwrought.
If you didn’t know the punch line of EPJ before you read this book, you do now: the average expert was roughly as accurate as a dart-throwing chimpanzee.
Hence the old joke about statisticians sleeping with their feet in an oven and their head in a freezer because the average temperature is comfortable.
As a result, they were unusually confident and likelier to declare things “impossible” or “certain.”
Ed Carmichael
What bad forecasters do
These experts gathered as much information from as many sources as they could. When thinking, they often shifted mental gears, sprinkling their speech with transition markers such as “however,” “but,” “although,” and “on the other hand.” They talked about possibilities and probabilities, not certainties. And while no one likes to say “I was wrong,” these experts more readily admitted it and changed their minds.
Ed Carmichael
What good forecasters do
Foxes beat hedgehogs on both calibration and resolution.
Ed Carmichael
The distinction: calibration means your stated probabilities match how often things actually happen, while resolution means committing to bold predictions, closer to 0% or 100%, and being right
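One way to see the difference: two toy forecasters with identical (perfect) calibration but very different resolution. This reuses the two-category Brier score sketched above; the event counts are invented for illustration.

```python
def brier(fs):  # same two-category Brier score as in the earlier sketch
    return sum((p - o) ** 2 + ((1 - p) - (1 - o)) ** 2
               for p, o in fs) / len(fs)

# Ten events, five happen and five don't. Both forecasters are perfectly
# calibrated, but only one commits to bold probabilities.
timid = [(0.5, 1)] * 5 + [(0.5, 0)] * 5   # always hedges at "maybe"
bold  = [(1.0, 1)] * 5 + [(0.0, 0)] * 5   # commits fully, and is right

print(brier(timid))  # 0.5: calibrated but uninformative (no resolution)
print(brier(bold))   # 0.0: calibrated AND decisive (high resolution)
```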
but he got his start as an economist in the Reagan administration and later worked with Art Laffer, the economist whose theories were the cornerstone of Ronald Reagan’s economic policies. Kudlow’s one Big Idea is supply-side economics. When President George W. Bush followed
He dubbed it “the Bush boom.” Reality fell short: growth and job creation were positive but somewhat disappointing relative to the long-term average and particularly in comparison to that of the Clinton era, which began with a substantial tax hike.
Ed Carmichael
Tax cuts don’t necessarily lead to an economic boom; just compare Clinton vs. Bush
They’re green-tinted glasses—like the glasses that visitors to the Emerald City were required to wear in L. Frank Baum’s The Wonderful Wizard of Oz.
revealed an inverse correlation between fame and accuracy: the more famous an expert was, the less accurate he was.
And so, as EPJ showed, hedgehogs are likelier to say something definitely will or won’t happen. For many audiences, that’s satisfying.
Ed Carmichael
Be VERY wary of media experts for this reason
British scientist Sir Francis Galton
Their average guess—their collective judgment—was 1,197 pounds, one pound short of the correct answer, 1,198 pounds. It was the earliest demonstration of a phenomenon popularized by—and now named for—James Surowiecki’s bestseller The Wisdom of Crowds.
Hundreds of people added valid information, creating a collective pool far greater than any one of them possessed.
Some suggested the correct answer was higher, some lower. So they canceled each other out. With valid information piling up and errors nullifying themselves, the net result was an astonishingly accurate estimate.
Ed Carmichael
The wisdom of crowds works because valid information accumulates while errors fall on both sides of the truth and cancel each other out
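A toy simulation of that cancellation, assuming unbiased noise around Galton's true ox weight of 1,198 pounds; the noise level and crowd size are invented, and a real crowd's errors only need to roughly offset.

```python
import random

random.seed(1)
TRUE_WEIGHT = 1198  # pounds, the correct answer in Galton's contest

# Each guesser sees the truth through idiosyncratic noise; errors land on
# both sides of the truth, so averaging cancels much of them out.
guesses = [TRUE_WEIGHT + random.gauss(0, 80) for _ in range(800)]

crowd = sum(guesses) / len(guesses)
crowd_error = abs(crowd - TRUE_WEIGHT)
beaten = sum(abs(g - TRUE_WEIGHT) > crowd_error for g in guesses)

print(f"crowd average off by {crowd_error:.1f} lb")
print(f"crowd beats {beaten / len(guesses):.0%} of individual guessers")
```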
And I happen to be one of those highly educated people who is familiar with game theory, so I know 0 is called the Nash equilibrium solution. QED. The only question is who will come with me to London.
Ed Carmichael
Quod erat demonstrandum = Latin for “that which was to be demonstrated”; point proven
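The game behind this passage is the classic “guess two-thirds of the average” contest: everyone picks a number from 0 to 100 and the winner is closest to two-thirds of the mean guess. A quick sketch of why 0 is the Nash equilibrium; the level-0 starting guess of 50 is a conventional assumption, not from the book.

```python
# Each extra level of reasoning shrinks the best response by 2/3, so fully
# iterated reasoning collapses to the Nash equilibrium of 0. Real crowds
# stop after a level or two, which is why the winning answer isn't 0.
guess = 50.0  # conventional "level-0" starting point (an assumption)
for level in range(8):
    print(f"level {level}: best guess {guess:.2f}")
    guess *= 2 / 3  # best response to the previous round's average
```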
And because I did not think about the problem from this different perspective, and factor it into my own judgment, I was wrong.
Depending on the species, there may be as many as thirty thousand of these lenses on a single eye, each one occupying a physical space slightly different from those of the adjacent lenses, giving it a unique perspective. Information from these thousands of unique perspectives flows into the dragonfly’s brain where it is synthesized into vision so superb that the dragonfly can see in almost every direction simultaneously, with the clarity and precision it needs to pick off flying insects at high speed.
Why did you assume that an opponent who raises the bet has a strong hand if you would not raise with the same strong hand? “And it’s not until I walk them through the exercise,” Duke says, that people realize they failed to truly look at the table from the perspective of their opponent.
“All models are wrong,” the statistician George Box observed, “but some are useful.”
Ed Carmichael
George Box’s point is that no model can accurately capture reality in all its messiness, but they CAN form a useful approximation
The United States had invaded Afghanistan to oust the Taliban, who had harbored Osama bin Laden.
Ed Carmichael
So the purpose of invading Afghanistan was to root out the Taliban
National Intelligence Estimates are the consensus view of the Central Intelligence Agency, the National Security Agency, the Defense Intelligence Agency, and thirteen other agencies. Collectively, these agencies are known as the intelligence community, or IC.
And this fantastically elaborate, expensive, and experienced intelligence apparatus concluded in October 2002 that the key claims of the Bush administration about Iraqi WMDs were correct.
Ed Carmichael
So even the American intelligence community, the supposedly best and brightest from the CIA and the NSA, was wrong about Iraq having WMDs
Even caustic critics of the Bush administration—people like Tom Friedman, who derisively called them “Bushies”—were convinced Saddam Hussein was hiding something, somewhere.
Ed Carmichael
It sounds like even Tom Friedman, an expert on the Middle East, got this wrong
What went wrong? One explanation was that the IC had caved to White House bullying. The intelligence had been politicized. But official investigations rejected that claim.
Ed Carmichael
So the intelligence community wasn’t pandering to the White House; they genuinely believed that Iraq had WMDs
Why Intelligence Fails,
Ed Carmichael
May be worth reading
So the question “Was the IC’s judgment reasonable?” is challenging. But it’s a snap to answer “Was the IC’s judgment correct?” As I noted in chapter 2, a situation like that tempts us with a bait and switch: replace the tough question with the easy one, answer it, and then sincerely believe that we have answered the tough question.
Ed Carmichael
The invasion of Iraq may NOT have been a bad decision in this light; it may have been reasonable given the evidence that was available
Jervis did not let the intelligence community off the hook. “There were not only errors but correctable ones,” he wrote about the IC’s analysis. “Analysis could and should have been better.”
Ed Carmichael
Jervis concluded that better analysis could have and should have been done pertaining to Iraq’s purported manufacture of WMDs
So the IC still would have concluded that Saddam had WMDs, they just would have been much less confident in that conclusion. That may sound like a gentle criticism. In fact, it’s devastating, because a less-confident conclusion from the IC may have made a huge difference:
At a White House briefing on December 12, 2002, the CIA director, George Tenet, used the phrase “slam dunk.”
Postmortems even revealed that the IC had never seriously explored the idea that it could be wrong. “There were no ‘red teams’ to attack the prevailing views, no analyses from devil’s advocates, no papers that provided competing possibilities,”
the 1979 failure to foresee the Iranian revolution—the biggest geopolitical disaster of that era
DARPA’s work even contributed to the invention of the Internet.
Think how shocking it would be to the intelligence professionals who have spent their lives forecasting geopolitical events—to be beaten by a few hundred ordinary people and some simple algorithms. It actually happened.
Even the extremizing tweak is based on a pretty simple insight: When you combine the judgments of a large group of people to calculate the “wisdom of the crowd” you collect all the relevant information that is dispersed among all those people. But none of those people has access to all that information. One person knows only some of it, another knows some more, and so on. What would happen if every one of those people were given all the information? They would become more confident—raising their forecasts closer to 100% or zero. If you then calculated the “wisdom of the crowd” it too would be…
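A sketch of one common extremizing transform, pushing the aggregate away from 50% on the odds scale; the exponent here is illustrative, and the excerpt doesn’t give the Good Judgment Project’s actual formula or tuned value.

```python
def extremize(p, a=2.0):
    """Push an aggregated probability p (strictly between 0 and 1) away
    from 0.5 by raising the odds to the power a. a > 1 extremizes;
    a = 1 leaves p unchanged. The value a=2.0 is purely illustrative."""
    odds = (p / (1 - p)) ** a
    return odds / (1 + odds)

print(round(extremize(0.70), 2))  # 0.84: the crowd's 70% becomes bolder
print(round(extremize(0.50), 2))  # 0.5: "maybe" stays "maybe"
```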