Kindle Notes & Highlights
Bill has answered roughly three hundred questions like “Will Russia officially annex additional Ukrainian territory in the next three months?” and “In the next year, will any country withdraw from the eurozone?”
Some of the questions are downright obscure, at least for most of us.
When Bill first sees one of these questions, he may have no clue how to answer it.
He gathers facts, balances clashing arguments, and settles on an answer.
We know that because each one of Bill’s predictions has been dated, recorded, and assessed for accuracy by independent scientific observers. His track record is excellent.
I call them superforecasters because that is what they are. Reliable evidence proves it. Explaining why they’re so good, and how others can learn to do what they do, is my goal in this book.
The one undeniable talent that talking heads have is their skill at telling a compelling story with conviction, and that is enough.
It was easiest to beat chance on the shortest-range questions that only required looking one year out, and accuracy fell off the further out experts tried to forecast—approaching the dart-throwing-chimpanzee level three to five years out.
Often the connections are harder to spot, but they are all around us, in things like the price we pay at the gas station or the layoffs down the street.
How predictable something is depends on what we are trying to predict, how far into the future, and under what circumstances.
Accuracy is seldom assessed after the fact, and almost never with sufficient regularity and rigor that conclusions can be drawn. The reason? Mostly it’s a demand-side problem: The consumers of forecasting—governments, businesses, and the public—don’t demand evidence of accuracy. So there is no measurement. Which means no revision.
“I have been struck by how important measurement is to improving the human condition,” Bill Gates wrote. “You can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal…. This may seem basic, but it is amazing how often it is not done and how hard it is to get right.”
IARPA is an agency within the intelligence community that reports to the Director of National Intelligence, and its job is to support daring research that promises to make American intelligence better at what it does.
but they do have a real, measurable skill at judging how high-stakes events are likely to unfold three months, six months, a year, or a year and a half in advance.
Foresight isn’t a mysterious gift bestowed at birth. It is the product of particular ways of thinking, of gathering information, of updating beliefs.
bet and a 40/60 bet. And yet, if it’s possible to improve foresight simply by measuring, and if the rewards of improved foresight are substantial, why isn’t measuring standard practice? A big part of the answer to that question lies in the psychology that convinces us we know things we really don’t—things like whether Tom Friedman is an accurate forecaster or not.
When physicians finally accepted that their experience and perceptions were not reliable means of determining whether a treatment works, they turned to scientific testing—and medicine finally started to make rapid advances. The same revolution needs to happen in forecasting.
As with the experts who had real foresight in my earlier research, what matters most is how the forecaster thinks.
superforecasting demands thinking that is open-minded, careful, curious, and—above all—self-critical. It also demands focus. The kind of thinking that produces superior judgment does not come effortlessly. Only the determined can deliver it reasonably consistently, which is why our analyses have consistently found commitment to self-improvement to be the strongest predictor of performance.
The point is now indisputable: when you have a well-validated statistical algorithm, use it.
“mimicking human meaning,” and thereby better at predicting human behavior, but “there’s a difference between mimicking
That’s a space human judgment will always occupy.
As Daniel Kahneman puts it, “System 1 is designed to jump to conclusions from little evidence.”13
Progress only really began when physicians accepted that the view from the tip of their nose was not enough to determine what works.
The first step in learning what works in forecasting, and what doesn’t, is to judge forecasts, and to do that we can’t make assumptions about what the forecast means. We have to know. There can’t be any ambiguity about whether a forecast is accurate or not, and Ballmer’s forecast is ambiguous. Sure, it looks wrong. It
It is far from unusual that a forecast that at first looks as clear as a freshly washed window proves too opaque to be conclusively judged right or wrong.
Judging forecasts is much harder than often supposed, a lesson I learned the hard way—from extensive and exasperating experience.
That’s why forecasts without timelines don’t appear absurd when they are made.
This sort of vague verbiage is more the rule than the exception. And it too renders forecasts untestable.
Hence forecasting is all about estimating the likelihood of something happening,
But it was never adopted. People liked clarity and precision in principle, but when it came time to make clear and precise forecasts, they weren’t so keen on numbers.
A more serious objection—then and now—is that expressing a probability estimate with a number may imply to the reader that it is an objective fact,
It’s to inform readers that numbers, just like words, only express estimates—opinions—and nothing more.
Also, bear in mind that words like “serious possibility” suggest the same thing numbers do, the only real difference being that numbers make it explicit, reducing the risk of confusion.
metacognition.
But a more fundamental obstacle to adopting numbers relates to accountability and what I call the wrong-side-of-maybe fallacy.
But people do judge. And they always judge the same way: they look at which side of “maybe”—50%—the probability was on. If the forecast said there was a 70% chance of rain
So what’s the safe thing to do? Stick with elastic language.
With perverse incentives like these, it’s no wonder people prefer rubbery words over firm numbers.
Every “may” is accompanied by an asterisk and the words “or may not” are buried in the fine print. But the interviewer didn’t notice the fine print in Ferguson’s forecast, so he did not ask him to clarify.16
They must use numbers. And one more thing is essential: we must have lots of forecasts.
If a meteorologist says there is a 70% chance of rain tomorrow, that forecast cannot be judged, but if she predicts the weather tomorrow, and the day after, and the day after that, for months, her forecasts can be tabulated and her track record determined. If her forecasting is perfect, rain happens 70% of the time when she says there is a 70% chance of rain, 30% of the time when she says there is a 30% chance of rain, and so on. This is called calibration. It can be plotted on a simple chart: perfect calibration is captured by the diagonal line.
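A minimal sketch in Python of the tabulation described above, using made-up illustrative data (the records list and the 70%/30% probability levels are assumptions for the example, not figures from the text): group days by the stated probability of rain and compare it with how often rain actually fell. For a well-calibrated forecaster the two match, which is what the diagonal on a calibration chart represents.

```python
# Minimal calibration sketch: compare stated probabilities with observed
# frequencies. The data below is illustrative only.
from collections import defaultdict

# Each pair is (forecast probability of rain, whether it actually rained).
records = [
    (0.7, True), (0.7, True), (0.7, False),
    (0.3, False), (0.3, True), (0.3, False),
]

# Group outcomes by the stated probability.
by_forecast = defaultdict(list)
for prob, rained in records:
    by_forecast[prob].append(rained)

# A perfectly calibrated forecaster's observed frequency in each group
# matches the stated probability (the points fall on the diagonal).
for prob in sorted(by_forecast):
    outcomes = by_forecast[prob]
    observed = sum(outcomes) / len(outcomes)
    print(f"said {prob:.0%} -> rained {observed:.0%} of {len(outcomes)} days")
```

With months of forecasts, the same counts yield one point per probability level, and plotting those points against the diagonal gives the calibration chart the passage describes.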
We could focus on the state level in presidential elections, for example, which would give us fifty results per election,