Superforecasting: The Art and Science of Prediction
28%
Remember that the Brier score measures the gap between forecasts and reality, where 2.0 is the result if your forecasts are the perfect opposite of reality, 0.5 is what you would get by random guessing, and 0 is the center of the bull’s-eye.
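The 0-to-2 scale in this highlight matches the two-category form of the Brier score, where the squared error is summed over both "yes" and "no" for each question. A minimal sketch, assuming yes/no questions only; the function name `brier_score` is mine, not the book's:

```python
def brier_score(forecasts, outcomes):
    """Mean two-category Brier score for yes/no questions (0 = perfect, 2 = worst).

    forecasts: probabilities (0..1) assigned to "yes"
    outcomes:  booleans, True if the event happened
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally long, non-empty lists")
    total = 0.0
    for p, happened in zip(forecasts, outcomes):
        o = 1.0 if happened else 0.0
        # Squared error on the "yes" side plus squared error on the "no" side.
        total += (p - o) ** 2 + ((1 - p) - (1 - o)) ** 2
    return total / len(forecasts)


perfect = brier_score([1.0, 0.0], [True, False])    # always right
hedged = brier_score([0.5, 0.5], [True, False])     # always says 50/50
opposite = brier_score([0.0, 1.0], [True, False])   # always confidently wrong
```

Under this formulation a forecaster who always says 50% lands on 0.5, which is the "random guessing" benchmark the quote refers to.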
30%
Students who started off with a string of hits had a higher opinion of their skill and thought they would shine again. Langer called this the “illusion of control,” but it is also an “illusion of prediction.”
30%
Even a dart-throwing chimp will hit the occasional bull’s-eye if he throws enough darts, and anyone can easily “predict” the next stock market crash by incessantly warning that the stock market is about to crash.
30%
Think of a lottery winner. It is fantastically unlikely that one particular ticket will win a major lottery, often one in many millions, but we don’t conclude that lottery winners are highly skilled ticket-pickers—because we know there are millions of tickets sold, which makes it highly likely that someone, somewhere, will win.
30%
Michael Mauboussin, a global financial strategist, in his book The Success Equation. But as Mauboussin noted, there is an elegant rule of thumb that applies to athletes and CEOs, stock analysts and superforecasters. It involves “regression to the mean.”
30%
Imagine that we knew everyone’s height and computed the correlation between the heights of fathers and sons. We would find a strong but imperfect relationship, a correlation of about 0.5, as captured by the line running through the data points in the chart below. It tells us that when the father is six feet, we should make a compromise prediction based on both the father’s height and the population average. Our best guess for the son is five feet ten. The son’s height has regressed toward the mean by two inches, halfway between the population average and the father’s height.
Ed Carmichael
So for someone above or below average, we should assume their kids will move closer to the mean, even if the attribute is genetically heritable.
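The height example can be worked through in a few lines. This is a sketch under two assumptions that the highlight only implies: the population average is five feet eight (68 inches, since five feet ten is described as halfway between it and the father's six feet), and fathers and sons have the same spread of heights. The function name is hypothetical.

```python
def predict_son_height(father_in, population_mean_in, correlation):
    """Best linear guess for the son's height, in inches.

    Assumes fathers and sons share the same mean and standard deviation,
    so the prediction keeps only `correlation` of the father's deviation.
    """
    return population_mean_in + correlation * (father_in - population_mean_in)


# Father is 6'0" (72 in), assumed population mean 5'8" (68 in), correlation 0.5.
son = predict_son_height(72, 68, 0.5)   # 70 inches, i.e. 5'10"
```

With a correlation of 1.0 the prediction would equal the father's height, and with 0.0 it would collapse to the population average.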
30%
Of course it’s when you have one of those awful days that you are most likely to seek help by visiting a homeopath or some other dispenser of medical treatments unsupported by solid scientific evidence. The next day you wake up and…you feel better! The treatment works! The placebo effect may have helped, but you probably would have felt better the next day even if you had received no treatment at all—thanks to regression to the mean, a fact that won’t occur to you unless you stop and think carefully, instead of going with the tip-of-your-nose conclusion.
31%
If their results were entirely decided by skill, there would be no regression: Frank would be just as awful in year 2 and Nancy would be just as spectacular.
Ed Carmichael
So the more the outcome of a given activity is dictated by luck rather than skill, the faster and stronger the regression to the mean.
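A small simulation can make the luck-versus-skill point concrete. This is only an illustrative sketch: the model (each year's result is a weighted mix of a fixed skill value and fresh luck), the `skill_share` parameter, and all function names are my assumptions, not something from the book.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def year_to_year_correlation(skill_share, n_people=5000, seed=0):
    """Correlation between year-1 and year-2 results in a toy model.

    Each person's result is skill_share * skill + (1 - skill_share) * luck,
    where skill stays the same across years and luck is redrawn every year.
    """
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(n_people):
        skill = rng.gauss(0, 1)
        year1.append(skill_share * skill + (1 - skill_share) * rng.gauss(0, 1))
        year2.append(skill_share * skill + (1 - skill_share) * rng.gauss(0, 1))
    return pearson(year1, year2)


all_skill = year_to_year_correlation(1.0)   # about 1: no regression at all
mixed = year_to_year_correlation(0.5)       # about 0.5: partial regression
all_luck = year_to_year_correlation(0.0)    # about 0: full regression
```

The lower the year-to-year correlation, the more of an extreme year-1 result we should expect to vanish in year 2.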
31%
as Edward Lorenz showed, means even something as tiny as a butterfly’s wing flaps can make a dramatic difference to what happens.
31%
This suggests that being recognized as “super” and placed on teams of intellectually stimulating colleagues improved their performance enough to erase the regression to the mean we would otherwise have seen.
32%
If superforecasting is a job for three-standard-deviation MENSA-certified geniuses—the top 1%—then the vast majority of us can never qualify. So why bother trying?
Ed Carmichael
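So the book equates the top 1% with being three standard deviations above the mean. Strictly, on a normal distribution the top 1% starts at about 2.33 standard deviations, and three standard deviations is closer to the top 0.13%. Remember: the standard deviation is the square root of the average of the squared distances of individual values from the mean (squaring removes the negative signs, and the square root brings the result back to the original units).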
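To check the arithmetic behind this note, here is a short sketch using Python's standard library. It assumes the trait is normally distributed, which the highlight does not state explicitly; the helper names are mine.

```python
from math import sqrt
from statistics import NormalDist


def population_std_dev(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    return sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def share_above(z):
    """Fraction of a normal population more than z standard deviations above the mean."""
    return 1 - NormalDist().cdf(z)


def cutoff_for_top_share(share):
    """How many standard deviations above the mean the top `share` begins."""
    return NormalDist().inv_cdf(1 - share)


spread = population_std_dev([2, 4, 4, 4, 5, 5, 7, 9])  # 2.0
top_at_three_sd = share_above(3)                        # roughly 0.0013
one_percent_cutoff = cutoff_for_top_share(0.01)         # roughly 2.33
```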
32%
measured crystallized intelligence—knowledge—using some U.S.-centric questions like “How many Justices sit on the Supreme Court?” and more global questions like “Which nations are permanent members of the UN Security Council?”
Ed Carmichael
Let’s see how I do. Respectively: 9; and the US, China, Russia, France, and the UK, which I believe are collectively referred to as the P5. I’m right! Note: there are TEN non-permanent members of the Security Council, elected for two-year terms, so there are 15 members in total at any given time.
33%
Regular forecasters scored higher on intelligence and knowledge tests than about 70% of the population. Superforecasters did better, placing higher than about 80% of the population.
33%
Second, although superforecasters are well above average, they did not score off-the-charts high and most fall well short of so-called genius territory, a problematic concept often arbitrarily defined as the top 1%, or an IQ of 135 and up.
Ed Carmichael
An IQ above 135 puts you in the top 1%.
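The "135 and up equals top 1%" figure can be sanity-checked with one line of statistics. The sketch assumes the common IQ scaling of mean 100 and standard deviation 15, which the highlight does not spell out; some tests use a different standard deviation, and the cutoff would shift accordingly.

```python
from statistics import NormalDist

# Assumed scaling: mean 100, standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def share_at_or_above(iq):
    """Fraction of the population scoring at or above `iq` under the assumed scale."""
    return 1 - IQ_SCALE.cdf(iq)


top_share = share_at_or_above(135)   # a little under 0.01, so roughly the top 1%
```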
33%
Ultimately, it’s not the crunching power that counts. It’s how you use it.
33%
The Italian American physicist Enrico Fermi—a central figure in the invention of the atomic bomb—
33%
Fermi knew people could do much better and the key was to break down the question with more questions like “What would have to be true for this to happen?” Here, we can break the question down by asking, “What information would allow me to answer the question?”
34%
narrow this down, Fermi would advise setting a confidence interval—a range that you are 90% sure contains the right answer.
34%
On October 12, 2004, Yasser Arafat, the seventy-five-year-old leader of the Palestine Liberation Organization, became severely ill with vomiting and abdominal pain. Over the next three weeks, his condition worsened. On October 29, he was flown to a hospital in France. He fell into a coma.
34%
but on November 11, 2004, the man who was once a seemingly indestructible enemy of Israel was pronounced dead. What killed him was uncertain. But even before he died there was speculation that he had been poisoned. In July 2012 researchers at Switzerland’s
34%
they had tested some of Arafat’s belongings and discovered unnaturally high levels of polonium-210. That was ominous. Polonium-210 is a radioactive element that can be deadly if ingested.
34%
In 2006 Alexander Litvinenko—a former Russian spy living in London and a prominent critic of Vladimir Putin—was murdered with polonium-210.
Ed Carmichael
Speculation that Putin killed him, I’m guessing?
35%
He has no expertise in the Israeli-Palestinian conflict, to say the least. But he didn’t need any to get off to a great start on this question. Thinking like Fermi, Bill unpacked the question by asking himself “What would it take for the answer to be yes? What would it take for it to be no?”
Ed Carmichael
Cool method to answer a question - what would it take for the answer to be yes? What would it take for the answer to be no? What would have to be true?
35%
This sort of storytelling can be very compelling, particularly when the available details are much richer than what I’ve provided here. But superforecasters wouldn’t bother with any of that, at least not at first. The first thing they would do is find out what percentage of American households own a pet.
Ed Carmichael
So it may be better not to concern yourself with individual details at first and to focus on the macro trends instead, to determine how statistically likely something is. Don’t get pulled into the storytelling.
35%
It’s natural to be drawn to the inside view. It’s usually concrete and filled with engaging detail we can use to craft a story about what’s going on. The outside view is typically abstract, bare, and doesn’t lend itself so readily to storytelling.
36%
If Bill Flack were asked whether, in the next twelve months, there would be an armed clash between China and Vietnam over some border dispute, he wouldn’t immediately delve into the particulars of that border dispute and the current state of China-Vietnam relations. He would instead look at how often there have been armed clashes in the past. “Say we get hostile conduct between China and Vietnam every five years,” Bill says. “I’ll use a five-year recurrence model to predict the future.” In any given year, then, the outside view would suggest to Bill there is a 20% chance of a clash. Having ...more
36%
After all, you could dive into the inside view and draw conclusions, then turn to the outside view. Wouldn’t that work as well? Unfortunately, no, it probably wouldn’t. The reason is a basic psychological concept called anchoring. When we make estimates, we tend to start with some number and adjust. The number we start with is called the anchor. It’s important because we typically underadjust, which means a bad anchor can easily produce a bad estimate. And it’s astonishingly easy to settle on a bad anchor.
36%
So a forecaster who starts by diving into the inside view risks being swayed by a number that may have little or no meaning. But if she starts with the outside view, her analysis will begin with an anchor that is meaningful. And a better anchor is a distinct advantage.
36%
A good exploration of the inside view does not involve wandering around, soaking up any and all information and hoping that insight somehow emerges. It is targeted and purposeful: it is an investigation, not an amble.
36%
Start with the first hypothesis: Israel poisoned Yasser Arafat with polonium. What would it take for that to be true? 1. Israel had, or could obtain, polonium. 2. Israel wanted Arafat dead badly enough to take a big risk. 3. Israel had the ability to poison Arafat with polonium. Each of these elements could then be researched—looking for evidence pro and con—to get a sense of how likely they are to be true, and therefore how likely the hypothesis is to be true.
Ed Carmichael
Break each hypothesis down to its constituent elements - and then research how likely those elements are to be true
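One way to turn the element-by-element research into a number is sketched below. The highlight does not say how to combine the elements, so this is my assumption: if all elements must hold, the hypothesis can be no more likely than its weakest element, and if the elements are treated as independent (a strong simplification), their probabilities multiply. The probabilities in the example are placeholders, not estimates from the book.

```python
from math import prod


def conjunction_estimate(element_probs):
    """Combine probabilities of elements that must ALL be true.

    Returns (independent_estimate, upper_bound):
    - independent_estimate multiplies the elements, assuming independence
    - upper_bound is the weakest element, which caps the joint probability
    """
    if not element_probs:
        raise ValueError("need at least one element")
    return prod(element_probs), min(element_probs)


# Placeholder values for "had the means", "had the motive", "had the access".
estimate, ceiling = conjunction_estimate([0.9, 0.5, 0.4])
```

Even with a generous-looking set of elements, the joint estimate falls quickly, which is a useful check against a compelling story.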
37%
First, he found a list of Islamist terror attacks on Wikipedia. Then he counted the number of attacks in the specified countries over the previous five years. There were six. “So I calculate the base rate as 1.2/year,” he wrote in the GJP forum.
37%
There were 69 days left in the forecast period. So David divided 69 by 365. Then he multiplied by 1.8. Result: 0.34. So he concluded that there is a 34% chance the answer to IARPA’s question would be yes.
Ed Carmichael
Example of how superforecasters make their calculations - the actual math of it
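The math in these two highlights can be reproduced directly. Note that the highlights jump from a base rate of 1.2 per year to a multiplier of 1.8; the step that adjusts the rate is not in the excerpt, so the sketch simply takes 1.8 as given. The `poisson_at_least_one` function is my own addition for comparison, not David's method: it treats attacks as a Poisson process, which is an assumption.

```python
from math import exp


def annual_base_rate(event_count, years):
    """Events per year over the lookback window."""
    return event_count / years


def prorated_probability(annual_rate, days_left, days_per_year=365):
    """The calculation described in the highlight: rate scaled by the time left."""
    return annual_rate * days_left / days_per_year


def poisson_at_least_one(annual_rate, days_left, days_per_year=365):
    """Alternative (my assumption): chance of one or more events in a Poisson process."""
    return 1 - exp(-annual_rate * days_left / days_per_year)


base_rate = annual_base_rate(6, 5)                 # 1.2 per year
davids_answer = prorated_probability(1.8, 69)      # about 0.34
poisson_answer = poisson_at_least_one(1.8, 69)     # somewhat lower, about 0.29
```

The prorated figure is strictly an expected number of events; it works as a probability only while it stays well below 1, which is why the Poisson version comes out a little lower.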
37%
When Bill Flack makes a judgment, he often explains his thinking to his teammates, as David Rogg did, and he asks them to critique it. In part, he does that because he hopes they’ll spot flaws and offer their own perspectives. But writing his judgment down is also a way of distancing himself from it, so he can step back and scrutinize it: “It’s an auto-feedback thing,” he says. “Do I agree with this? Are there holes in this? Should I be looking for something else to fill this in? Would I be convinced by this if I were somebody else?”
37%
Researchers have found that merely asking people to assume their initial judgment is wrong, to seriously consider why that might be, and then make another judgment, produces a second estimate which, when combined with the first, improves accuracy almost as much as getting a second estimate from another person.
Ed Carmichael
Assuming I’m wrong for the sake of exploring WHY that might be - and then later producing a second estimate
37%
There is an even simpler way of getting another perspective on a question: tweak its wording. Imagine a question like “Will the South African government grant the Dalai Lama a visa within six months?”
37%
To check that tendency, turn the question on its head and ask, “Will the South African government deny the Dalai Lama for six months?” That tiny wording change encourages you to lean in the opposite direction and look for reasons why it would deny the visa—a desire not to anger its biggest trading partner being a rather big one.
38%
Why do they put so much into it? One answer is it’s fun. “Need for cognition” is the psychological term for the tendency to engage in and enjoy hard mental slogs.
Ed Carmichael
This is me to some extent - I have a high need for cognition - to be mentally stimulated
38%
An element of personality is also likely involved. In personality psychology, one of the “Big Five” traits is “openness to experience,” which has various dimensions, including preference for variety and intellectual curiosity. It’s unmistakable in many superforecasters.
Ed Carmichael
Take every opportunity I can to learn something about the world - don’t see it as pointless
38%
Doug is not merely open-minded. He is actively open-minded.
Ed Carmichael
You don’t want to just be open-minded but ACTIVELY open-minded. Actively seek out diversity of thought.
38%
For superforecasters, beliefs are hypotheses to be tested, not treasures to be guarded.
38%
arcane
Ed Carmichael
Arcane = understood by few; a sort of secret knowledge
38%
As the scientist and science fiction writer Arthur C. Clarke famously observed, “Any sufficiently advanced technology is indistinguishable from magic.”
38%
occult
Ed Carmichael
Can be used like “arcane”
38%
Monte Carlo model
Ed Carmichael
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. So it’s an algorithm to help you make a prediction.
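A tiny example of the idea, as a sketch rather than anything from the book: the chance that two dice sum to 10 or more is deterministic in principle (6 of 36 combinations), but it can also be estimated by rolling simulated dice many times. The function names and the dice problem are my own illustration.

```python
import random


def estimate_probability(trial, n_trials=100_000, seed=0):
    """Monte Carlo estimate: run `trial` many times and count how often it succeeds.

    `trial` takes a random.Random instance and returns True or False.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if trial(rng))
    return hits / n_trials


def two_dice_at_least_ten(rng):
    """One simulated roll of two fair dice; success if the sum is 10 or more."""
    return rng.randint(1, 6) + rng.randint(1, 6) >= 10


estimate = estimate_probability(two_dice_at_least_ten)   # close to 6/36
```

The same pattern scales to problems with no easy exact answer: swap in a `trial` function that simulates one possible future and count how often the outcome of interest occurs.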
38%
Aramaic
Ed Carmichael
An ancient Semitic language that originated in the region of ancient Syria
38%
Monte Carlo models.
39%
A smart executive will not expect universal agreement, and will treat its appearance as a warning flag that groupthink has taken hold.
39%
It was “the wisdom of the crowd,” gift wrapped. All he had to do was synthesize the judgments. A simple averaging would be a good start. Or he could do a weighted averaging—so that those whose judgment he most respects get more say in the collective conclusion.
Ed Carmichael
This would have been an example of Dalio’s meritocratic, believability-weighted decision-making system.
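The two ways of synthesizing judgments mentioned in the highlight can be sketched as follows. The forecasts and weights are made-up illustrations, and how to choose the weights (track record, believability, or anything else) is left open here, as it is in the excerpt.

```python
def simple_average(forecasts):
    """Every judgment counts equally."""
    return sum(forecasts) / len(forecasts)


def weighted_average(forecasts, weights):
    """Judgments with larger weights get more say in the collective conclusion."""
    if len(forecasts) != len(weights) or sum(weights) <= 0:
        raise ValueError("need one positive-sum weight per forecast")
    return sum(f * w for f, w in zip(forecasts, weights)) / sum(weights)


# Three advisers give probabilities; the third is trusted twice as much.
views = [0.2, 0.4, 0.9]
equal_say = simple_average(views)                  # 0.5
more_say = weighted_average(views, [1, 1, 2])      # 0.6
```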
41%
1713 publication of Jakob Bernoulli’s Ars Conjectandi—before the best minds started to think seriously about probability.
41%
Why is a decline from 5% to 0% so much more valuable than a decline from 10% to 5%? Because it delivers more than a 5% reduction in risk. It delivers certainty.