Noise: A Flaw in Human Judgment
Rank but Do Not Force

In principle, therefore, forced ranking should bring about much-needed improvements. Yet it often backfires. We do not intend here to review all its possible unwanted effects (which are often related to poor implementation rather than principle). But two issues with forced ranking systems offer some general lessons.
The first is the confusion between absolute and relative performance. It is certainly impossible for 98% of the managers of any company to be in the top 20%, 50%, or even 80% of their peer group. But it is not impossible that they all “meet expectations,” if these expectations have been defined ex ante and in absolute terms.
The upshot is that a system that depends on relative evaluations is appropriate only if an organization cares about relative performance.
The second problem is that the forced distribution of the ratings is assumed to reflect the distribution of the underlying true performances—typically, something close to a normal distribution. Yet even if the distribution of performances in the population being rated is known, the same distribution may not be reproduced in a smaller group, such as the group assessed by a single evaluator. If you randomly pick ten people from a population of several thousand, there is no guarantee that exactly two of them will belong to the top 20% of the general population. (“No guarantee” is an understatement: …)
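To make the understatement concrete, here is a quick binomial calculation, a sketch using the ten-person group and 20% cutoff from the passage above:

```python
from math import comb

p_top = 0.20  # chance a randomly drawn person is in the population's top 20%
n = 10        # size of the group a single evaluator rates

# Probability that exactly 2 of the 10 fall in the top 20%,
# i.e., that an honest rater could satisfy a forced 20% quota.
p_exactly_two = comb(n, 2) * p_top**2 * (1 - p_top) ** (n - 2)
print(f"P(exactly 2 of 10 in the top 20%) = {p_exactly_two:.3f}")  # ~0.302
```

Roughly seven times out of ten, in other words, a truthful rater of ten randomly drawn people cannot match the forced quota exactly.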
Whether or not you accept these arguments, the fatal flaw of forced ranking is not the “ranking,” but the “forced.” Whenever judgments are forced onto an inappropriate scale, either because a relative scale is used to measure an absolute performance or because judges are forced to distinguish the indistinguishable, the choice of the scale mechanically adds noise.
Although performance feedback, when associated with a development plan for the employee, can bring about improvements, performance ratings as they are most often practiced demotivate as often as they motivate. As one review article summarized, “No matter what has been tried over decades to improve [performance management] processes, they continue to generate inaccurate information and do virtually nothing to drive performance.”
With a case scale, each rating of a new individual is a comparison with the anchor cases. It becomes a relative judgment. Because comparative judgments are less susceptible to noise than ratings are, case scales are more reliable than scales that use numbers, adjectives, or behavioral descriptions.
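As an illustration of how a case scale turns rating into comparison, here is a minimal sketch; the anchor vignettes and the 0-5 scale are invented for the example, not taken from the book:

```python
# Anchor cases in ascending order of performance (invented vignettes).
ANCHORS = [
    "misses deadlines and needs constant supervision",
    "completes routine work, but with frequent corrections",
    "delivers reliably on standard assignments",
    "handles ambiguous problems with little guidance",
    "anticipates problems and raises the whole team's performance",
]

def rate(outperforms_anchor) -> int:
    """Return a 0-5 rating built purely from comparative judgments.

    outperforms_anchor(vignette) -> bool is the rater's answer to
    "is this person stronger than the anchor case?"; the rating is
    simply the number of anchors the candidate outperforms.
    """
    return sum(outperforms_anchor(v) for v in ANCHORS)

# Example: a rater who judges the candidate stronger than the first
# three anchor cases but not the last two.
strong_anchors = set(ANCHORS[:3])
print(rate(lambda v: v in strong_anchors))  # -> 3
```

The rater never assigns an absolute number directly; every judgment is a comparison against a concrete case, which is what makes the scale less noisy.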
If you are designing or revising a performance management system, you will need to answer these questions and many more. Our aspiration here is not to examine these questions but to make a more modest suggestion: if you do measure performance, your performance ratings have probably been pervaded by system noise and, for that reason, they might be essentially useless and quite possibly counterproductive. Reducing this noise is a challenge that cannot be solved by simple technological fixes. It requires clear thinking about the judgments that raters are expected to make. Most likely, you will …
Speaking of Defining the Scale

“We spend a lot of time on our performance ratings, and yet the results are one-quarter performance and three-quarters system noise.”

“We tried 360-degree feedback and forced ranking to address this problem, but we may have made things worse.”

“If there is so much level noise, it is because different raters have completely different ideas of what ‘good’ or ‘great’ means. They will only agree if we give them concrete cases as anchors on the rating scale.”
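The “one-quarter performance” figure is a claim about variance shares. A minimal simulation along those lines, where the component sizes are assumptions chosen to reproduce the quoted split, not numbers from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
n_employees, n_raters = 10_000, 100

# Assumed components: signal variance 1.0, plus enough rater-level and
# idiosyncratic noise to make the signal about 25% of the total.
true_perf = rng.normal(0.0, 1.0, n_employees)      # real performance
level_bias = rng.normal(0.0, 1.2, n_raters)        # rater severity/leniency
rater = rng.integers(0, n_raters, n_employees)     # one rater per employee
pattern_noise = rng.normal(0.0, 1.2, n_employees)  # rater-by-employee error

rating = true_perf + level_bias[rater] + pattern_noise

share = np.var(true_perf) / np.var(rating)
print(f"share of rating variance explained by performance: {share:.0%}")  # ~26%
```

The point of the sketch is only that ratings can be dominated by noise components that have nothing to do with the person being rated.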
If you are now in a position to hire employees, your selection methods probably include some version of that familiar ritual: the job interview. As one organizational psychologist noted, “It is rare, even unthinkable, for someone to be hired without some type of interview.” And almost all professionals rely to some degree on their intuitive judgments when making hiring decisions in these interviews.
A caveat is needed here. The definition of success is a nontrivial problem. Typically, performance is evaluated on the basis of supervisor ratings. Sometimes, the metric is length of employment. Such measures raise questions, of course, especially given the questionable validity of performance ratings, which we noted in the previous chapter. However, for the purpose of evaluating the quality of an employer’s judgments when selecting employees, it seems reasonable to use the judgments that the same employer makes when evaluating the employees thus hired. Any analysis of the quality of hiring …
We can easily see why traditional interviews produce error in their prediction of job performance. Some of this error has to do with what we have termed objective ignorance (see chapter 11). Job performance depends on many things, including how quickly the person you hire adjusts to her new position or how various life events affect her work. Much of this is unpredictable at the time of hiring. This uncertainty limits the predictive validity of interviews and, indeed, any other personnel selection technique.
First impressions turn out to matter—a lot. Perhaps you think that judging on first impressions is unproblematic; after all, at least some of what we learn from them is meaningful. All of us know that we do learn something in the first seconds of interaction with a new acquaintance. It stands to reason that this may be particularly true of skilled interviewers. But the first seconds of an interview reflect exactly the sort of superficial qualities you associate with first impressions: early perceptions are based mostly on a candidate’s extraversion and verbal skills. Even the quality of a …
Just as we can find an imaginary pattern in random data or imagine a shape in the contours of a cloud, we are capable of finding logic in perfectly meaningless answers.
The story illustrates that however much we would like to believe that our judgment about a candidate is based on facts, our interpretation of facts is colored by prior attitudes.
Improving Personnel Selection Through Structure
Google also adopted a decision hygiene strategy we haven’t yet described in detail: structuring complex judgments. The term structure can mean many things. As we use the term here, a structured complex judgment is defined by three principles: decomposition, independence, and delayed holistic judgment.
The first principle, decomposition, breaks down the decision into components, or mediating assessments.
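A minimal sketch of how the three principles might look in a hiring workflow; the assessment names and the 1-5 scale are illustrative assumptions, not the book's (or Google's) actual rubric:

```python
# Hypothetical mediating assessments for a hiring decision.
ASSESSMENTS = ["technical skill", "problem solving", "communication", "leadership"]

def evaluate(candidate, score_assessment):
    """Structured judgment in three steps.

    score_assessment(candidate, name) -> int in 1..5 is a rater's
    fact-based score for one assessment; it is never shown the
    scores already collected.
    """
    # 1. Decomposition: the decision is broken into mediating assessments.
    # 2. Independence: each assessment is scored on its own, so an early
    #    impression cannot spill over (halo) into the later ones.
    scores = {name: score_assessment(candidate, name) for name in ASSESSMENTS}
    # 3. Delayed holistic judgment: only after every assessment is in
    #    are the scores combined into an overall view.
    overall = sum(scores.values()) / len(scores)
    return scores, overall

# Example: scores already collected from an interview panel (hypothetical).
panel = {"technical skill": 4, "problem solving": 5,
         "communication": 3, "leadership": 3}
print(evaluate("candidate A", lambda c, name: panel[name]))
```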
Creating this sort of structure for a recruiting task may seem like mere common sense. Indeed, if you are hiring an entry-level accountant or an administrative assistant, standard job descriptions exist and specify the competencies needed. As professional recruiters know, however, defining the key assessments becomes difficult for unusual or senior positions, and this step is frequently skipped. One prominent headhunter points out that defining the required competencies in a sufficiently specific manner is a challenging and often neglected task. He highlights the importance for …