Kindle Notes & Highlights
constructing teams of judges
who are selected for being both good at what they do and complementary to one another.
another gain in accuracy can be obtained by combining judgments that are both independent and complementary.
standard tool for that task is multiple regression
The test that best predicts the outcome is selected first.
one that adds the most predictive power to the first test, by providing predictions that are both valid and not redundant with the first.
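The selection rule in the highlights above (pick the most valid test first, then the test that adds the most predictive power beyond it) can be sketched in a few lines. This is a minimal illustration and not the book's procedure: the function names, the two-step limit, and the made-up scores are all assumptions, and the gain is measured as the increase in R² when a second predictor joins the first.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r2_two_predictors(r_y1, r_y2, r_12):
    """R^2 of a regression on two predictors, from the three correlations."""
    denom = 1 - r_12 ** 2
    if denom < 1e-9:
        # The second test is a (near) copy of the first: it adds nothing.
        return r_y1 ** 2
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / denom

def pick_two_tests(tests, outcome):
    """Hypothetical helper: greedy two-step selection of tests.

    tests maps a test name to its list of scores.
    Returns (first, second, gains), where gains maps each remaining test
    to the R^2 it adds on top of the first test.
    """
    validity = {name: pearson(scores, outcome) for name, scores in tests.items()}
    # Step 1: the test that best predicts the outcome on its own.
    first = max(validity, key=lambda name: abs(validity[name]))
    # Step 2: the test that adds the most R^2 to the first one.
    gains = {}
    for name, scores in tests.items():
        if name == first:
            continue
        r_12 = pearson(tests[first], scores)
        r2 = r2_two_predictors(validity[first], validity[name], r_12)
        gains[name] = r2 - validity[first] ** 2
    second = max(gains, key=gains.get)
    return first, second, gains

# Made-up scores for illustration only.
outcome = [2, 1, 2, 5, 6, 5, 6, 9]
tests = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],                   # most valid test
    "B": [1.5, 1.5, 3.5, 3.5, 4.5, 6.5, 6.5, 8.5],   # valid, but redundant with A
    "C": [1, -1, -1, 1, 1, -1, -1, 1],               # less valid, independent of A
}
first, second, gains = pick_two_tests(tests, outcome)
print(first, second)
```

In this toy data, test B correlates more strongly with the outcome than test C does, yet C is chosen second because B mostly repeats what A already says.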
Paradoxically, the average of that noisy group will be more accurate than the average of a unanimous one.
aggregation can only reduce noise if judgments are truly independent.
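Why independence matters can be shown with the standard formula for the standard deviation of an average of correlated judgments. The sketch below is an illustration, not taken from the book; it assumes every judge has the same noise level `sigma` and every pair of judges shares the same correlation `rho`, and the function name is hypothetical.

```python
from math import sqrt

def noise_of_average(sigma, n, rho):
    """Standard deviation of the mean of n judgments.

    Assumes each judgment has standard deviation sigma and every pair of
    judgments has correlation rho (both simplifying assumptions).
    Var(mean) = sigma^2 * (1 + (n - 1) * rho) / n
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not -1.0 <= rho <= 1.0 or (n > 1 and rho < -1.0 / (n - 1)):
        raise ValueError("rho is not a feasible common correlation for n judges")
    return sigma * sqrt((1 + (n - 1) * rho) / n)

independent = noise_of_average(10, 100, 0.0)   # fully independent judges
correlated = noise_of_average(10, 100, 0.5)    # judges who influence one another
unanimous = noise_of_average(10, 100, 1.0)     # perfectly correlated judges
print(independent, correlated, unanimous)
```

With one hundred independent judges the noise of the average falls to a tenth of a single judge's noise. With perfectly correlated judges it does not fall at all, however many are averaged.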
Organizations that want to harness the power of diversity must welcome the disagreements that will arise when team members reach their judgments independently.
average
perpetual beta,
relevant base rate?”
how can we ensure more diversity of opinions?”
For many conditions, the diagnosis is routine and largely mechanical, and rules and procedures are in place to minimize noise.
shifting from judgment to calculation.
second opinion.
have been astonished to see how much the second opinion diverges from the first.
sheer magnitude.
describe some of the approaches to noise reduction used by the...
This highlight has been truncated due to consecutive passage length restrictions.
one decision hygiene ...
development of diagnostic ...
Treatments can also be noisy,
best treatment are shockingly variable,
conclusions hold in numerous nations.
skill matters a lot.
“policies that improve skill perform better than uniform decision guidelines.”
Radiologists, for example, call diagnostic variation their “Achilles’ heel.”
In medicine, between-person noise, or interrater reliability, is usually measured by the kappa statistic.
value of 1 reflects perfect agreement;
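For readers unfamiliar with the kappa statistic, here is a minimal sketch of Cohen's kappa for two raters, which compares observed agreement with the agreement expected by chance. The function name and the ratings are made up for illustration; studies with more than two raters typically use other variants such as Fleiss' kappa.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share of
    agreements and p_e the share expected if the raters labelled
    independently at their own base rates.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)

# Made-up diagnoses ("P" = positive, "N" = negative) for ten cases.
doctor_1 = ["P", "P", "P", "P", "N", "N", "N", "N", "N", "N"]
doctor_2 = ["P", "P", "P", "N", "N", "N", "N", "N", "N", "P"]
kappa = cohen_kappa(doctor_1, doctor_2)
print(round(kappa, 3))
```

Here the two doctors agree on 8 of 10 cases, but because chance alone would produce agreement on about half of them, kappa is well below 0.8.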
reviewing one hundred randomly selected drug-drug interactions, showed “poor agreement.”
It is worth pausing over these findings.
describe these findings
convey a general sense of the pervasiveness of noise,
documented, potentially leading to unnecessary procedures.
problem has yet to be solved.
They disagreed dramatically, with weak correlations on both number and location.
detecting TB is a chest X-ray,
Variability in diagnosis of TB has been well documented for almost seventy-five years.
also variability in TB diagnoses between radiologists in different countries.
doctors misdiagnosed melanomas in one of every three lesions.
failed to diagnose melanoma from skin biopsies
large study found that the range of false negatives among different radiologists varied from 0% (the radiologist was correct every time) to greater than 50%
False negatives and false positives, from different radiologists, ensure that there is noise.
In areas that involve vague criteria and complex judgments, intrarater reliability, as it is called, can be poor.
doctors are significantly more likely to order cancer screenings early in the morning than late in the afternoon.
physicians almost inevitably run behind in clinic after seeing patients with complex medical problems that require more than the usual twenty-minute slot.
Another illustration of the role of fatigue among clinicians is the lower rate of appropriate handwashing during the end of hospital shifts.
great deal of room for judgment, and the relevant criteria for diagnosis are so open-ended that noise will be substantial and difficult to reduce.
this is the case in much of psychiatry.
doctors are now using deep-learning algorithms and artificial intelligence to reduce noise.