Noise: A Flaw in Human Judgment
Read December 12–19, 2021
22%
People learn from others, and if early speakers seem to like something or want to do something, others might assent. At least this is so if they do not have reason to distrust them and if they lack a good reason to think that they are wrong. For our purposes, the most important point is that informational cascades make noise across groups possible and even likely.
22%
However, the study of juries uncovers a distinct kind of social influence that is also a source of noise: group polarization. The basic idea is that when people speak with one another, they often end up at a more extreme point in line with their original inclinations. If, for example, most people in a seven-person group tend to think that opening a new office in Paris would be a pretty good idea, the group is likely to conclude, after discussion, that opening that office would be a terrific idea. Internal discussions often create greater confidence, greater unity, and greater extremism, …
23%
The explanations for group polarization are, in turn, similar to the explanations for cascade effects. Information plays a major role. If most people favor a severe punishment, then the group will hear many arguments in favor of severe punishment—and fewer arguments the other way. If group members are listening to one another, they will shift in the direction of the dominant tendency, rendering the group more unified, more confident, and more extreme. And if people care about their reputation within the group, they will shift in the direction of the dominant tendency, which will also produce …
26%
In predictive judgments, human experts are easily outperformed by simple formulas—models of reality, models of a judge, or even randomly generated models. This finding argues in favor of using noise-free methods: rules and algorithms,
27%
In fact, many types of mechanical approaches, from almost laughably simple rules to the most sophisticated and impenetrable machine algorithms, can outperform human judgment. And one key reason for this outperformance—albeit not the only one—is that all mechanical approaches are noise-free.
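[Editor's note] The noise-free property can be made concrete with a toy simulation of my own (all numbers invented, not from the book): a fixed linear formula returns the identical prediction every time it sees the same case, while a simulated "judge" who adds random occasion noise to the same formula does not.

```python
import random

def mechanical_rule(gpa, test_score):
    """A deliberately simple linear formula: same inputs -> same output, always."""
    return 0.6 * gpa + 0.4 * test_score

def noisy_judge(gpa, test_score, rng):
    """A simulated human judge: the same formula plus random occasion noise."""
    return 0.6 * gpa + 0.4 * test_score + rng.gauss(0, 0.5)

rng = random.Random(42)
case = (3.2, 3.8)  # one hypothetical applicant, evaluated five times

rule_calls = {mechanical_rule(*case) for _ in range(5)}
judge_calls = {round(noisy_judge(*case, rng), 3) for _ in range(5)}

print(len(rule_calls))   # 1: the rule is noise-free
print(len(judge_calls))  # >1: the same judge's repeated answers scatter
```

Even a rule this crude never disagrees with itself, which is one reason it can outperform a more insightful but inconsistent human.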
29%
it is worth asking why algorithms are not used much more extensively for the types of professional judgments we discuss in this book. For all the spirited talk about algorithms and machine learning, and despite important exceptions in particular fields, their use remains limited. Many experts ignore the clinical-versus-mechanical debate, preferring to trust their judgment. They have faith in their intuitions and doubt that machines could do better. They regard the idea of algorithmic decision making as dehumanizing and as an abdication of their responsibility.
30%
Both intractable uncertainty (what cannot possibly be known) and imperfect information (what could be known but isn’t) make perfect prediction impossible. These unknowns are not problems of bias or noise in your judgment; they are objective characteristics of the task. This objective ignorance of important unknowns severely limits achievable accuracy. We take a terminological liberty here, replacing the commonly used uncertainty with ignorance. This term helps limit the risk of confusion between uncertainty, which is about the world and the future, and noise, which is variability in judgments …
30%
In general, however, you can safely expect that people who engage in predictive tasks will underestimate their objective ignorance. Overconfidence is one of the best-documented cognitive biases. In particular, judgments of one’s ability to make precise predictions, even from limited information, are notoriously overconfident. What we said of noise in predictive judgments can also be said of objective ignorance: wherever there is prediction, there is ignorance, and more of it than you think.
31%
Some years after his shocking discovery of the futility of much long-term forecasting, Tetlock teamed up with his spouse, Barbara Mellers, to study how well people do when asked to forecast world events in the relatively short term—usually less than a year. The team discovered that short-term forecasting is difficult but not impossible, and that some people, whom Tetlock and Mellers called superforecasters, are consistently better at it than most others, including professionals in the intelligence community. In the terms we use here, their new findings are compatible with the notion that …
31%
The previous chapters may have given you the impression that algorithms are crushingly superior to predictive judgments. That impression, however, would be misleading. Models are consistently better than people, but not much better. There is essentially no evidence of situations in which people do very poorly and models do very well with the same information.
31%
The denial of ignorance adds an answer to the puzzle that baffled Meehl and his followers: why his message has remained largely unheeded, and why decision makers continue to rely on their intuition. When they listen to their gut, decision makers hear the internal signal and feel the emotional reward it brings. This internal signal that a good judgment has been reached is the voice of confidence, of “knowing without knowing why.” But an objective assessment of the evidence’s true predictive power will rarely justify that level of confidence.
32%
The challenge is that the “price” in this situation is not the same. Intuitive judgment comes with its reward, the internal signal. People are prepared to trust an algorithm that achieves a very high level of accuracy because it gives them a sense of certainty that matches or exceeds that provided by the internal signal. But giving up the emotional reward of the internal signal is a high price to pay when the alternative is some sort of mechanical process that does not even claim high validity.
32%
This observation has an important implication for the improvement of judgment. Despite all the evidence in favor of mechanical and algorithmic prediction methods, and despite the rational calculus that clearly shows the value of incremental improvements in predictive accuracy, many decision makers will reject decision-making approaches that deprive them of the ability to exercise their intuition.
32%
In this chapter, we address the prevalent and misguided sense that events that could not have been predicted can nevertheless be understood.
33%
However, in the discourse of social science, and in most everyday conversations, a claim to understand something is a claim to understand what causes that thing. The sociologists who collected and studied the thousands of variables in the Fragile Families study were looking for the causes of the outcomes they observed. Physicians who understand what ails a patient are claiming that the pathology they have diagnosed is the cause of the symptoms they have observed. To understand is to describe a causal chain. The ability to make a prediction is a measure of whether such a causal chain has indeed …
33%
We must, however, remember that while correlation does not imply causation, causation does imply correlation. Where there is a causal link, we should find a correlation. If you find no correlation between age and shoe size among adults, then you can safely conclude that after the end of adolescence, age does not make feet grow larger and that you have to look elsewhere for the causes of differences in shoe size. In short, wherever there is causality, there is correlation. It follows that where there is causality, we should be able to predict—and correlation, the accuracy of this prediction, is …
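[Editor's note] The age-and-shoe-size logic can be checked in a toy simulation of my own (invented parameters, not from the book): when one variable causally drives another, a correlation shows up; when the causal link is absent, the correlation is near zero.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)

# Children: age causally drives foot growth, so correlation appears.
child_age = [rng.uniform(5, 15) for _ in range(1000)]
child_foot = [0.8 * a + rng.gauss(0, 1.5) for a in child_age]

# Adults: age no longer causes growth; correlation vanishes.
adult_age = [rng.uniform(25, 65) for _ in range(1000)]
adult_foot = [26 + rng.gauss(0, 1.5) for _ in adult_age]

print(round(pearson_r(child_age, child_foot), 2))  # strong positive
print(round(pearson_r(adult_age, adult_foot), 2))  # near zero
```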
33%
Whatever the outcome (eviction or not), once it has happened, causal thinking makes it feel entirely explainable, indeed predictable.
33%
In the valley of the normal, events unfold just like the Joneses’ eviction: they appear normal in hindsight, although they were not expected, and although we could not have predicted them. This is because the process of understanding reality is backward-looking. An occurrence that was not actively anticipated (the eviction of the Jones family) triggers a search of memory for a candidate cause (the tough job market, the inflexible manager). The search stops when a good narrative is found.
34%
More broadly, our sense of understanding the world depends on our extraordinary ability to construct narratives that explain the events we observe. The search for causes is almost always successful because causes can be drawn from an unlimited reservoir of facts and beliefs about the world. As anyone who listens to the evening news knows, for example, few large movements of the stock market remain unexplained. The same news flow can “explain” either a fall of the indices (nervous investors are worried about the news!) or a rise (sanguine investors remain optimistic!). When the search for an …
34%
Causal thinking avoids unnecessary effort while retaining the vigilance needed to detect abnormal events. In contrast, statistical thinking is effortful. It requires the attention resources that only System 2, the mode of thinking associated with slow, deliberate thought, can bring to bear. Beyond an elementary level, statistical thinking also demands specialized training. This type of thinking begins with ensembles and considers individual cases as instances of broader categories.
34%
The reliance on flawed explanations is perhaps inevitable, if the alternative is to give up on understanding our world. However, causal thinking and the illusion of understanding the past contribute to overconfident predictions of the future. As we will see, the preference for causal thinking also contributes to the neglect of noise as a source of error, because noise is a fundamentally statistical notion. Causal thinking helps us make sense of a world that is far less predictable than we think. It also explains why we view the world as far more predictable than it really is. In the valley of …
34%
This book extends half a century of research on intuitive human judgment, the so-called heuristics and biases program. The first four decades of this research program were reviewed in Thinking, Fast and Slow, which explored the psychological mechanisms that explain both the marvels and the flaws of intuitive thinking. The central idea of the program was that people who are asked a difficult question use simplifying operations, called heuristics. In general, heuristics, which are produced by fast, intuitive thinking, also known as System 1 thinking, are quite useful and yield adequate answers. …
35%
Given how much we stressed that statistical bias can be detected only when the true value is known, you may wonder how psychological biases can be studied when the truth is unknown. The answer is that researchers confirm a psychological bias either by observing that a factor that should not affect judgment does have a statistical effect on it, or that a factor that should affect judgment does not.
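[Editor's note] A sketch of this detection logic, with entirely made-up data: show two groups the same case but vary a factor that should be irrelevant (here, a hypothetical random anchor number) and check whether it statistically shifts the judgments. If it does, a bias is demonstrated without anyone knowing the true value.

```python
import random
import statistics

rng = random.Random(7)

def judged_estimate(anchor, rng):
    """Hypothetical judges: an underlying belief around 50, nudged 30% toward the anchor."""
    belief = rng.gauss(50, 10)
    return 0.7 * belief + 0.3 * anchor

# Same question, two groups, different (supposedly irrelevant) anchors.
low_group = [judged_estimate(anchor=10, rng=rng) for _ in range(200)]
high_group = [judged_estimate(anchor=90, rng=rng) for _ in range(200)]

gap = statistics.fmean(high_group) - statistics.fmean(low_group)
print(round(gap, 1))  # a factor that "should not matter" shifts judgments by roughly 24 points
```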
36%
A few minutes of research would reveal that estimates of CEO turnover in US companies hover around 15% annually. This statistic suggests that the average incoming CEO has a roughly 72% probability of still being around after two years. Of course, this number is only a starting point, and the specifics of Gambardi’s case will affect your final estimate. But if you focused solely on what you were told about Gambardi, you neglected a key piece of information. (Full disclosure: We wrote the Gambardi case to illustrate noisy judgment; it took us weeks before we realized that it was also a prime …
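[Editor's note] The arithmetic behind the 72% base rate in the passage: with roughly 15% annual turnover, the chance of a CEO surviving each year is about 85%, and surviving two years in a row is that probability squared.

```python
annual_turnover = 0.15                  # rough base rate cited in the text
survive_one_year = 1 - annual_turnover  # 0.85
survive_two_years = survive_one_year ** 2

print(round(survive_two_years, 4))  # 0.7225, i.e. roughly 72%
```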
36%
Substituting a judgment of how easily examples come to mind for an assessment of frequency is known as the availability heuristic.
36%
Consider how we tend to answer each of the following questions by using its easier substitute: Do I believe in climate change? Do I trust the people who say it exists? Do I think this surgeon is competent? Does this individual speak with confidence and authority? Will the project be completed on schedule? Is it on schedule now? Is nuclear energy necessary? Do I recoil at the word nuclear?
36%
This example illustrates a different type of bias, which we call conclusion bias, or prejudgment. Like Lucas, we often start the process of judgment with an inclination to reach a particular conclusion. When we do that, we let our fast, intuitive System 1 thinking suggest a conclusion. Either we jump to that conclusion and simply bypass the process of gathering and integrating information, or we mobilize System 2 thinking—engaging in deliberate thought—to come up with arguments that support our prejudgment. In that case, the evidence will be selective and distorted: because of confirmation …
48%
Three things matter. Judgments are both less noisy and less biased when those who make them are well trained, are more intelligent, and have the right cognitive style. In other words: good judgments depend on what you know, how well you think, and how you think. Good judges tend to be experienced and smart, but they also tend to be actively open-minded and willing to learn from new information.
49%
Yet some professionals in these domains come to be called experts. The confidence we have in these experts’ judgment is entirely based on the respect they enjoy from their peers. We call them respect-experts. The term respect-expert is not meant to be disrespectful. The fact that some experts are not subject to an evaluation of the accuracy of their judgments is not a criticism; it is a fact of life in many domains.
50%
GMA contributes significantly to the quality of performance in occupations that require judgment, even within a pool of high-ability individuals. The notion that there is a threshold beyond which GMA ceases to make a difference is not supported by the evidence. This conclusion in turn strongly suggests that if professional judgments are unverifiable but assumed to reach for an invisible bull’s-eye, then the judgments of high-ability people are more likely to be close. If you must pick people to make judgments, picking those with the highest mental ability makes a lot of sense.
50%
We do not aim here to draw hard-and-fast conclusions about how to pick individuals who will make good judgments in a given domain. But two general principles emerge from this brief review. First, it is wise to recognize the difference between domains in which expertise can be confirmed by comparison with true values (such as weather forecasting) and domains that are the province of respect-experts.
51%
Second, some judges are going to be better than their equally qualified and experienced peers. If they are better, they are less likely to be biased or noisy. Among many things that explain these differences, intelligence and cognitive style matter. Although no single measure or scale unambiguously predicts judgment quality, you may want to look for the sort of people who actively search for new information that could contradict their prior beliefs, who are methodical in integrating that information into their current perspective, and who are willing, even eager, to change their minds as a …
51%
The personality of people with excellent judgment may not fit the generally accepted stereotype of a decisive leader. People often tend to trust and like leaders who are firm and clear and who seem to know, immediately and deep in their bones, what is right. Such leaders inspire confidence. But the evidence suggests that if the goal is to reduce error, it is better for leaders (and others) to remain open to counterarguments and to know that they might be wrong. If they end up being decisive, it is at the end of a process, not at the start.
51%
Ex post, or corrective, debiasing is often carried out intuitively. Suppose that you are supervising a team in charge of a project and that the team estimates that it can complete its project in three months. You might want to add a buffer to the members’ judgment and plan for four months, or more, thus correcting a bias (the planning fallacy) you assume is present.
51%
Ex ante or preventive debiasing interventions fall in turn into two broad categories. Some of the most promising are designed to modify the environment in which the judgment or decision takes place. Such modifications, or nudges, as they are known, aim to reduce the effect of biases or even to enlist biases to produce a better decision. A simple example is automatic enrollment in pension plans. Designed to overcome inertia, procrastination, and optimistic bias, automatic enrollment ensures that employees will be saving for retirement unless they deliberately opt out.
51%
A different type of ex ante debiasing involves training decision makers to recognize their biases and to overcome them.
52%
We suggest undertaking this search for biases neither before nor after the decision is made, but in real time. Of course, people are rarely aware of their own biases when they are being misled by them. This lack of awareness is itself a known bias, the bias blind spot. People often recognize biases more easily in others than they do in themselves. We suggest that observers can be trained to spot, in real time, the diagnostic signs that one or several familiar biases are affecting someone else’s decisions or recommendations.
52%
Bias is error we can often see and even explain. It is directional: that is why a nudge can limit the detrimental effects of a bias, or why an effort to boost judgment can combat specific biases. It is also often visible: that is why an observer can hope to diagnose biases in real time as a decision is being made. Noise, on the other hand, is unpredictable error that we cannot easily see or explain. That is why we so often neglect it—even when it causes grave damage. For this reason, strategies for noise reduction are to debiasing what preventive hygiene measures are to medical treatment: the …
52%
The analogy with handwashing is intentional. Hygiene measures can be tedious. Their benefits are not directly visible; you might never know what problem they prevented from occurring. Conversely, when problems do arise, they may not be traceable to a specific breakdown in hygiene observance. For these reasons, handwashing compliance is difficult to enforce, even among health-care professionals, who are well aware of its importance.
55%
The necessary methodological steps are relatively simple. They illustrate a decision hygiene strategy that has applicability in many domains: sequencing information to limit the formation of premature intuitions. In any judgment, some information is relevant, and some is not. More information is not always better, especially if it has the potential to bias judgments by leading the judge to form a premature intuition.
55%
Dror has another recommendation that illustrates the same decision hygiene strategy: examiners should document their judgments at each step. They should document their analysis of a latent fingerprint before they look at exemplar fingerprints to decide whether they are a match. This sequence of steps helps experts avoid the risk that they see only what they are looking for. And they should record their judgment on the evidence before they have access to contextual information that risks biasing them.
55%
When a different examiner is called on to verify the identification made by the first person, the second person should not be aware of the first judgment.
56%
Less obvious is the possibility that your judgment can be altered by another trigger of occasion noise: information—even when it is accurate information. As in the example of the fingerprint examiners, as soon as you know what others think, confirmation bias can lead you to form an overall impression too early and to ignore contradictory information.
56%
We will not review them exhaustively here, but we will focus on two noise-reduction strategies that have broad applicability. One is an application of the principle we mentioned in chapter 18: selecting better judges produces better judgments. The other is one of the most universally applicable decision hygiene strategies: aggregating multiple independent estimates. The easiest way to aggregate several forecasts is to average them. Averaging is mathematically guaranteed to reduce noise: specifically, it divides it by the square root of the number of judgments averaged. This means that if you …
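[Editor's note] The square-root claim can be verified in a quick simulation of my own (illustrative numbers): averaging n independent judgments, each carrying the same noise, shrinks the standard deviation of the average by a factor of √n.

```python
import random
import statistics

rng = random.Random(1)
TRUE_VALUE = 100.0
NOISE_SD = 12.0
N_JUDGES = 9  # averaging 9 judgments should cut noise by sqrt(9) = 3

# Many single judgments versus many 9-judge averages of the same quantity.
single = [TRUE_VALUE + rng.gauss(0, NOISE_SD) for _ in range(20000)]
averaged = [
    statistics.fmean(TRUE_VALUE + rng.gauss(0, NOISE_SD) for _ in range(N_JUDGES))
    for _ in range(20000)
]

print(round(statistics.stdev(single), 1))    # close to 12
print(round(statistics.stdev(averaged), 1))  # close to 12 / 3 = 4
```

Note that the guarantee holds only if the judgments are independent; the cascade and polarization effects quoted earlier are precisely what destroys that independence.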
57%
Some of the most innovative work on the quality of forecasting, going well beyond what we have explored thus far, started in 2011, when three prominent behavioral scientists founded the Good Judgment Project. Philip Tetlock (whom we encountered in chapter 11 when we discussed his assessment of long-term forecasts of political events); his spouse, Barbara Mellers; and Don Moore teamed up to improve our understanding of forecasting and, in particular, why some people are good at it.
57%
Second, the researchers asked participants to make their forecasts in terms of probabilities that an event would happen, rather than a binary “it will happen” or “it will not happen.” To many people, forecasting means the latter—taking a stand one way or the other. However, given our objective ignorance of future events, it is much better to formulate probabilistic forecasts.
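[Editor's note] Probabilistic forecasts can be scored even though no single forecast is ever simply "right" or "wrong." The Good Judgment Project used the Brier score: the mean squared difference between the stated probability and the outcome (1 if the event happened, 0 if not), where lower is better. A small illustration with invented forecasts shows why calibrated hedging beats all-or-nothing calls:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes = [1, 0, 1, 1, 0]  # invented events: 1 = happened, 0 = did not

bold = [1.0, 0.0, 1.0, 0.0, 1.0]    # all-or-nothing "it will / will not happen" calls
hedged = [0.8, 0.2, 0.7, 0.6, 0.3]  # probabilistic and reasonably calibrated

print(round(brier_score(bold, outcomes), 3))    # 0.4 -- two confident misses are costly
print(round(brier_score(hedged, outcomes), 3))  # 0.084
```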
57%
So, now that we know how they were scored, how well did the Good Judgment Project volunteers do? One of the major findings was that the overwhelming majority of the volunteers did poorly, but about 2% stood out. As mentioned earlier, Tetlock calls these well-performing people superforecasters. They were hardly unerring, but their predictions were much better than chance. Remarkably, one government official said that the group did significantly “better than the average for intelligence community analysts who could read intercepts and other secret data.” This comparison is worth pausing over. …
57%
What makes superforecasters so good? Consistent with our argument in chapter 18, we could reasonably speculate that they are unusually intelligent. That speculation is not wrong. On GMA tests, the superforecasters do better than the average volunteer in the Good Judgment Project (and the average volunteer is significantly above the national average).
57%
But their real advantage is not their talent at math; it is their ease in thinking analytically and probabilistically.
57%
Consider superforecasters’ willingness and ability to structure and disaggregate problems. Rather than form a holistic judgment about a big geopolitical question (whether a nation will leave the European Union, whether a war will break out in a particular place, whether a public official will be assassinated), they break it up into its component parts. They ask, “What would it take for the answer to be yes? What would it take for the answer to be no?” Instead of offering a gut feeling or some kind of global hunch, they ask and try to answer an assortment of subsidiary questions.