More on this book
Community
Kindle Notes & Highlights
wherever there is judgment, there is noise—and more of it than you think.
The psychology of this process is well understood. Confidence is nurtured by the subjective experience of judgments that are made with increasing fluency and ease, in part because they resemble judgments made in similar cases in the past. Over time, as this underwriter learned to agree with her past self, her confidence in her judgments increased. She gave no indication that—after the initial apprenticeship phase—she had learned to agree with others, had checked to what extent she did agree with them, or had even tried to prevent her practices from drifting away from those of her colleagues.
the amount of noise observed when an organization takes a serious look almost always comes as a shock. Our conclusion is simple: wherever there is judgment, there is noise, and more of it than you think.
Noise in evaluative judgments is problematic for a different reason. In any system in which judges are assumed to be interchangeable and assigned quasi-randomly, large disagreements about the same case violate expectations of fairness and consistency.
Good decision making must be based on objective and accurate predictive judgments that are completely unaffected by hopes and fears, or by preferences and values.
the average level of sentencing functions like a personality trait. You could use this study to arrange judges on a scale that ranges from very harsh to very lenient, just as a personality test might measure their degree of extraversion or agreeableness.
we do not always produce identical judgments when faced with the same facts on two occasions.
group outcomes can be manipulated fairly easily, because popularity is self-reinforcing.
independence is a prerequisite for the wisdom of crowds.
The potential dependence of outcomes on the judgments of a few individuals—those who speak first or who have the largest influence—should be especially worrisome now that we have explored how noisy individual judgments can be.
clinicians and other professionals are distressingly weak in what they often see as their unique strength: the ability to integrate information.
trivial formulas, consistently applied, outdo clinical judgment.
Predictions did not lose accuracy when the model generated predictions. They improved. In most cases, the model out-predicted the professional on which it was based. The ersatz was better than the original product.
the machine-learning model performs much better than human judges do at predicting which defendants are high risks.
all mechanical prediction techniques, not just the most recent and more sophisticated ones, represent significant improvements on human judgment.
We expect machines to be perfect. If this expectation is violated, we discard them. Because of this intuitive expectation, however, people are likely to distrust algorithms and keep using their judgment, even when this choice produces demonstrably inferior results. This attitude is deeply rooted and unlikely to change until near-perfect predictive accuracy can be achieved.
Intelligence community analysts are trained to make accurate forecasts; they are not amateurs. In addition, they have access to classified information. And yet they do not do as well as the superforecasters do.
Organizations that want to harness the power of diversity must welcome the disagreements that will arise when team members reach their judgments independently. Eliciting and aggregating judgments that are both independent and diverse will often be the easiest, cheapest, and most broadly applicable decision hygiene strategy.
The medical profession is likely to rely on algorithms more and more in the future; they promise to reduce both bias and noise and to save lives and money in the process.
In terms of noise, psychiatry is an extreme case. When diagnosing the same patient using the same diagnostic criteria, psychiatrists frequently disagree with one another.
In medicine in general, guidelines have been highly successful in reducing both bias and noise. They have helped doctors, nurses, and patients and greatly improved public health in the process. The medical profession needs more of them.
if your goal is to bring out the best in people, you can reasonably ask whether measuring individual performance and using that measurement to motivate people through fear and greed is the best approach (or even an effective one).
the first seconds of an interview reflect exactly the sort of superficial qualities you associate with first impressions: early perceptions are based mostly on a candidate’s extraversion and verbal skills. Even the quality of a handshake is a significant predictor of hiring recommendations! We may all like a firm handshake, but few recruiters would consciously choose to make it a key hiring criterion.
We have a structured process to make hiring decisions. Why don’t we have one for strategic decisions? After all, options are like candidates.
For many people, a key practical consideration is whether an algorithm has a disparate impact on identifiable groups. Exactly how to test for disparate impact, and how to decide what constitutes discrimination, bias, or fairness for an algorithm, are surprisingly complex topics, well beyond the scope of this book. The fact that this question can be raised at all, however, is a distinct advantage of algorithms over human judgments.
Shakespeare’s Merchant of Venice is easily read as an objection to noise-free rules and a plea for a role of mercy in law and in human judgment generally.
in some cases, noise can be counted as a rights violation, and in general, legal systems all over the world should be making much greater efforts to control noise. Consider criminal sentencing; civil fines for wrongdoing; and the grant or denial of asylum, educational opportunities, visas, building permits, and occupational licenses.
Noise is the unwanted variability of judgments,
Noise is variability in judgments that should be identical.
The surprises that motivated this book are the sheer magnitude of system noise and the amount of damage that it does. Both of these far exceed common expectations.
It is difficult for us to imagine that mindless adherence to simple rules will often achieve higher accuracy than we can—but this is by now a well-established fact.
The goal of judgment is accuracy, not individual expression.