Superforecasting: The Art and Science of Prediction
These are ordinary people, from former ballroom dancers to retired computer programmers, who are nonetheless able to predict the future with 60% greater accuracy than regular forecasters. They are superforecasters.
The authors show the methods used by these superforecasters, which enable them to outperform even professional intelligence analysts with access to classified data.
Explaining why they’re so good, and how others can learn to do what they do, is my goal in this book.
But I realized that as word of my work spread, its apparent meaning was mutating. What my research had shown was that the average expert had done little better than guessing on many of the political and economic questions I had posed. “Many” does not equal all. It was easiest to beat chance on the shortest-range questions that only required looking one year out, and accuracy fell off the further out experts tried to forecast—approaching the dart-throwing-chimpanzee level three to five years out.
I believe it is possible to see into the future, at least in some situations and to some extent, and that any intelligent, open-minded, and hardworking person can cultivate the requisite skills.
But this particular humiliation, on December 17, 2010, caused Mohamed Bouazizi, aged twenty-six, to set himself on fire, and Bouazizi’s self-immolation sparked protests. The police responded with typical brutality. The protests spread. Hoping to assuage the public, the dictator of Tunisia, President Zine el-Abidine Ben Ali, visited Bouazizi in the hospital. Bouazizi died on January 4, 2011. The unrest grew. On January 14, Ben Ali fled to a cushy exile in Saudi Arabia, ending his twenty-three-year kleptocracy.
In 1972 the American meteorologist Edward Lorenz wrote a paper with an arresting title: “Predictability: Does the Flap of a Butterfly’s Wings in Brazil Set Off a Tornado in Texas?”
This is a big reason for the “skeptic” half of my “optimistic skeptic” stance. We live in a world where the actions of one nearly powerless man can have ripple effects around the world—ripples that affect us all to varying degrees.
How predictable something is depends on what we are trying to predict, how far into the future, and under what circumstances.
Accuracy is seldom determined after the fact and is almost never done with sufficient regularity and rigor that conclusions can be drawn. The reason? Mostly it’s a demand-side problem: The consumers of forecasting—governments, business, and the public—don’t demand evidence of accuracy. So there is no measurement. Which means no revision.
“I have been struck by how important measurement is to improving the human condition,” Bill Gates wrote. “You can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal. … This may seem basic, but it is amazing how often it is not done and how hard it is to get right.”8 He is right about what it takes to drive progress, and it is surprising how rarely it’s done in forecasting. Even that simple first step—setting a clear goal—hasn’t been taken.
We know that in so much of what people want to predict—politics, economics, finance, business, technology, daily life—predictability exists, to some degree, in some circumstances.
By varying the experimental conditions, we could gauge which factors improved foresight, by how much, over which time frames, and how good forecasts could become if best practices were layered on each other. Laid out like that, it sounds simple. It wasn’t. It was a demanding program that took the talents and hard work of a multidisciplinary team based at the University of California, Berkeley, and the University of Pennsylvania.
And a big part of what American intelligence does is forecast global political and economic trends.
To change that, IARPA created a forecasting tournament in which five scientific teams led by top researchers in the field would compete to generate accurate forecasts on the sorts of tough questions intelligence analysts deal with every day.
Over four years, IARPA posed nearly five hundred questions about world affairs. Time frames were shorter than in my earlier research, with the vast majority of forecasts extending out more than one month and less than one year. In all, we gathered over one million individual judgments about the future.
and even outperformed professional intelligence analysts with access to classified data.
Foresight isn’t a mysterious gift bestowed at birth. It is the product of particular ways of thinking, of gathering information, of updating beliefs. These habits of thought can be learned and cultivated by any intelligent, thoughtful, determined person. It may not even be all that hard to get started.
One result that particularly surprised me was the effect of a tutorial covering some basic concepts that we’ll explore in this book and are summarized in the Ten Commandments appendix.
In the late 1980s I worked out a methodology and conducted what was, at the time, the biggest test of expert political forecasting accuracy ever. One result, delivered many years later, was the punch line that now makes me squirm. But another discovery of that research didn’t receive nearly as much attention even though it was far more important: one group of experts had modest but real foresight. What made the difference between the experts with foresight and those who were so hopeless they dragged the average down to the level of a dart-throwing chimp? It wasn’t some mystical gift or access …
Superforecasting does require minimum levels of intelligence, numeracy, and knowledge of the world, but anyone who reads serious books about psychological research probably has those prerequisites. So what is it that elevates forecasting to superforecasting? As with the experts who had real foresight in my earlier research, what matters most is how the forecaster thinks. I’ll describe this in detail, but broadly speaking, superforecasting demands thinking that is open-minded, careful, curious, and—above all—self-critical. It also demands focus. The kind of thinking that produces superior …
Only the determined can deliver it reasonably consistently, which is why our analyses have consistently found commitment to self-improvement to be the strongest predictor of performance.
Meehl’s claim upset many experts, but subsequent research—now more than two hundred studies—has shown that in most cases statistical algorithms beat subjective judgment, and in the handful of studies where they don’t, they usually tie.
In 1997 IBM’s Deep Blue beat chess champion Garry Kasparov. Now, commercially available chess programs can beat any human. In 2011 IBM’s Watson beat Jeopardy! champions Ken Jennings and Brad Rutter.
What Ferrucci does see becoming obsolete is the guru model that makes so many policy debates so puerile: “I’ll counter your Paul Krugman polemic with my Niall Ferguson counterpolemic, and rebut your Tom Friedman op-ed with my Bret Stephens blog.” Ferrucci sees light at the end of this long dark tunnel: “I think it’s going to get stranger and stranger” for people to listen to the advice of experts whose views are informed only by their subjective judgment. Human thought is beset by psychological pitfalls, a fact that has only become widely recognized in the last decade or two. “So what I want …
This melancholy moment happened in 1956 but the patient, Archie Cochrane, did not die, which is fortunate because he went on to become a revered figure in medicine.
We have all been too quick to make up our minds and too slow to change them.
The cure for this plague of certainty came tantalizingly close to discovery in 1747, when a British ship’s doctor named James Lind took twelve sailors suffering from scurvy, divided them into pairs, and gave each pair a different treatment: vinegar, cider, sulfuric acid, seawater, a bark paste, and citrus fruit. It was an experiment born of desperation. Scurvy was a mortal threat to sailors on long-distance voyages and not even the confidence of physicians could hide the futility of their treatments. So Lind took six shots in the dark—and one hit. The two sailors given the citrus recovered …
“Is the application of the numerical method to the subject-matter of medicine a trivial and time-wasting ingenuity as some hold, or is it an important stage in the development of our art, as others proclaim it?” the Lancet asked in 1921. The British statistician Austin Bradford Hill responded emphatically that it was the latter, and laid out a template for modern medical investigation.
In describing how we think and decide, modern psychologists often deploy a dual-system model that partitions our mental universe into two domains. System 2 is the familiar realm of conscious thought. It consists of everything we choose to focus on. By contrast, System 1 is largely a stranger to us. It is the realm of automatic perceptual and cognitive operations—like those you are running right now to transform the print on this page into a meaningful sentence or to hold the book while reaching for a glass and taking a sip. We have no awareness of these rapid-fire processes but we could not …
That’s the “availability heuristic,” one of many System 1 operations—or heuristics—discovered by Daniel Kahneman, his collaborator Amos Tversky, and other researchers in the fast-growing science of judgment and choice.
A defining feature of intuitive judgment is its insensitivity to the quality of the evidence on which the judgment is based.
These tacit assumptions are so vital to System 1 that Kahneman gave them an ungainly but oddly memorable label: WYSIATI (What You See Is All There Is).14
This compulsion to explain arises with clocklike regularity every time a stock market closes and a journalist says something like “The Dow rose ninety-five points today on news that …” A quick check will often reveal that the news that supposedly drove the market came out well after the market had risen. But that minimal level of scrutiny is seldom applied. It’s a rare day when a journalist says, “The market rose today for any one of a hundred different reasons, or a mix of them, so no one knows.”
Scientists must be able to answer the question “What would convince me I am wrong?” If they can’t, it’s a sign they have grown too attached to their beliefs.
Formally, it’s called attribute substitution, but I call it bait and switch: when faced with a hard question, we often surreptitiously replace it with an easy one.
So the availability heuristic—like Kahneman’s other heuristics—is essentially a bait-and-switch maneuver. And just as the availability heuristic is usually an unconscious System 1 activity, so too is bait and switch.
The instant we wake up and look past the tip of our nose, sights and sounds flow to the brain and System 1 is engaged. This perspective is subjective, unique to each of us. Only you can see the world from the tip of your own nose. So let’s call it the tip-of-your-nose perspective.
Drawing such seemingly different conclusions about snap judgments, Kahneman and Klein could have hunkered down and fired off rival polemics. But, like good scientists, they got together to solve the puzzle. “We agree on most of the issues that matter,” they concluded in a 2009 paper.
an infant will soon show obvious symptoms of infection,” Kahneman and Klein wrote. “On the other hand, it is unlikely that there is publicly available information that could be used to predict how well a particular stock will do—if such valid information existed, the price of the stock would already reflect it. Thus, we have more reason to trust the intuition of an experienced fireground commander about the stability of a building, or the intuitions of a nurse about an infant, than to trust the intuitions of a stock broker.”
All too often, forecasting in the twenty-first century looks too much like nineteenth-century medicine. There are theories, assertions, and arguments. There are famous figures, as confident as they are well compensated. But there is little experimentation, or anything that could be called science, so we know much less than most people realize. And we pay the price. Although bad forecasting rarely leads as obviously to harm as does bad medicine, it steers us subtly toward bad decisions and all that flows from them—including monetary losses, missed opportunities, unnecessary suffering, even war …
Consider the open letter sent to Ben Bernanke, then the chairman of the Federal Reserve, in November 2010. Signed by a long list of economists and commentators, including the Harvard economic historian Niall Ferguson and Amity Shlaes of the Council on Foreign Relations, the letter calls on the Federal Reserve to stop its policy of large-scale asset purchases known as “quantitative easing” because it “risk[s] currency debasement and inflation.”
At lunch one day in 1988, my then–Berkeley colleague Daniel Kahneman tossed out a testable idea that proved prescient. He speculated that intelligence and knowledge would improve forecasting but the benefits would taper off fast.
In intelligence circles, Sherman Kent is a legend. With a PhD in history, Kent left a faculty position at Yale to join the Research and Analysis Branch of the newly created Coordinator of Information (COI) in 1941. The COI became the Office of Strategic Services (OSS). The OSS became the Central Intelligence Agency (CIA). By the time Kent retired from the CIA in 1967, he had profoundly shaped how the American intelligence community does what it calls intelligence analysis—the methodical examination of the information collected by spies and surveillance to figure out what it means, and what …
would invade. In March 1951 National Intelligence Estimate (NIE) 29-51 was published. “Although it is impossible to determine which course of action the Kremlin is likely to adopt,” the report concluded, “we believe that the extent of [Eastern European] military and propaganda preparations indicates that an attack on Yugoslavia in 1951 should be considered a serious possibility.” By most standards, that is clear, meaningful language. No one suggested otherwise when the estimate was published and read by top officials throughout the government. But a few days later, Kent was chatting with a …
Kent was right to worry. In 1961, when the CIA was planning to topple the Castro government by landing a small army of Cuban expatriates at the Bay of Pigs, President John F. Kennedy turned to the military for an unbiased assessment. The Joint Chiefs of Staff concluded that the plan had a “fair chance” of success.
In 2012, when the Supreme Court was about to release its long-awaited decision on the constitutionality of Obamacare, prediction markets—markets that let people bet on possible outcomes—pegged the probability of the law being struck down at 75%. When the court upheld the law, the sagacious New York Times reporter David Leonhardt declared that “the market—the wisdom of the crowds—was wrong.”
If a meteorologist says there is a 70% chance of rain tomorrow, that single forecast cannot be judged, but if she predicts the weather tomorrow, and the day after, and the day after that, for months, her forecasts can be tabulated and her track record determined. If her forecasting is perfect, rain happens 70% of the time when she says there is a 70% chance of rain, 30% of the time when she says there is a 30% chance of rain, and so on. This is called calibration. It can be plotted on a simple chart: stated probability on one axis, observed frequency on the other. Perfect calibration falls along the diagonal line of that chart.
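Calibration is straightforward to compute once forecasts and outcomes have been tabulated. A minimal sketch in Python (the function name and the 10%-wide bins are illustrative choices, not anything specified in the book):

```python
def calibration(forecasts, outcomes):
    """For each stated probability (binned to the nearest 10%), return the
    observed frequency of the event among forecasts made at that level."""
    bins = {}
    for p, hit in zip(forecasts, outcomes):
        key = round(p, 1)  # bin stated probabilities to the nearest 10%
        bins.setdefault(key, []).append(hit)
    return {p: sum(hits) / len(hits) for p, hits in sorted(bins.items())}

# A perfectly calibrated track record: rain on 7 of 10 days called at 70%,
# and on 3 of 10 days called at 30%.
forecasts = [0.7] * 10 + [0.3] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7
print(calibration(forecasts, outcomes))  # {0.3: 0.3, 0.7: 0.7}
```

Plotting stated probability against observed frequency from such a table gives the calibration chart; a perfectly calibrated forecaster's points lie on the diagonal.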
But it’s poor resolution because the forecaster never strays out of the minor-shades-of-maybe zone between 40% and 60%.
The math behind this system was developed by Glenn W. Brier in 1950, hence results are called Brier scores. In effect, Brier scores measure the distance between what you forecast and what actually happened. So Brier scores are like golf scores: lower is better. Perfection is 0. A hedged fifty-fifty call, or random guessing in the aggregate, will produce a Brier score of 0.5. A forecast that is wrong to the greatest possible extent—saying there is a 100% chance that something will happen and it doesn’t, every time—scores a disastrous 2.0, as far from The Truth as it is possible to get.
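The arithmetic behind those benchmark numbers is just squared error, summed over both possible outcomes, as in Brier's original 1950 formulation. A sketch (variable names are mine):

```python
def brier(forecast, outcome):
    """Brier's original two-outcome score for a binary event: squared error
    on 'it happens' plus squared error on 'it doesn't'. 0 is perfect, 2 is
    maximally wrong. outcome is 1 if the event occurred, else 0."""
    return (forecast - outcome) ** 2 + ((1 - forecast) - (1 - outcome)) ** 2

print(brier(0.5, 1))  # hedged fifty-fifty call -> 0.5
print(brier(1.0, 0))  # 100% sure and wrong -> 2.0
print(brier(1.0, 1))  # 100% sure and right -> 0.0
```

Note that many modern treatments keep only the first squared term, halving every score, so the worst possible value becomes 1 rather than the 2.0 used here and in the book.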