Harald G.’s Kindle Notes & Highlights for The Tyranny of Metrics

The book is not about the evils of measurement of human performance, or of rewarding achievement, or of transparency. It is about “metric fixation”—a perversion of these, based upon beliefs that seem reasonable at first, but turn out to be unreasonable in practice. The title, The Tyranny of Metrics, is not meant to convey the message that metrics are intrinsically tyrannical, but rather that they are frequently used in ways that are dysfunctional and oppressive.

7%

There are things that can be measured. There are things that are worth measuring. But what can be measured is not always what is worth measuring; what gets measured may have no relationship to what we really want to know. The costs of measuring may be greater than the benefits. The things that get measured may draw effort away from the things we really care about. And measurement may provide us with distorted knowledge—knowledge that seems solid but is actually deceptive.

7%

Used properly, measurement, as we’ll see, can be a good thing. So can transparency. But they can also distort, divert, displace, distract, and discourage.

8%

The most characteristic feature of metric fixation is the aspiration to replace judgment based on experience with standardized measurement. For judgment is understood as personal, subjective, and self-interested. Metrics, by contrast, are supposed to provide information that is hard and objective.

8%

Concrete interests of power, money, and status are at stake. Metric fixation leads to a diversion of resources away from frontline producers toward managers, administrators, and those who gather and manipulate data.

10%

As I began to investigate these issues, a book by a sociologist at the Harvard Business School, Rakesh Khurana’s From Higher Aims to Hired Hands: The Social Transformation of American Business Schools and the Unfulfilled Promise of Management as a Profession, opened my eyes to the intellectual history of business schools themselves, and the broader impact of what gets taught in them.

11%

A key premise of metric fixation concerns the relationship between measurement and improvement. There is a dictum (wrongly) attributed to the great nineteenth-century physicist Lord Kelvin: “If you cannot measure it, you cannot improve it.” In 1986 the American management guru, Tom Peters, embraced the motto, “What gets measured gets done,” which became a cornerstone belief of metrics. In time, some drew the conclusion that “anything that can be measured can be improved.”

11%

The key components of metric fixation are ■ the belief that it is possible and desirable to replace judgment, acquired by personal experience and talent, with numerical indicators of comparative performance based upon standardized data (metrics); ■ the belief that making such metrics public (transparent) assures that institutions are actually carrying out their purposes (accountability); ■ the belief that the best way to motivate people within these organizations is by attaching rewards and penalties to their measured performance, rewards that are either monetary (pay-for-performance) or ...more

12%

measuring only a few aspects creates incentives to neglect the rest. When organizations committed to metrics wake up to this fact, they typically add more performance measures—which creates a cascade of data, data that becomes ever less useful, while gathering it sucks up more and more time and resources.

12%

In the process, the nature of work is transformed in ways that are often pernicious. Professionals tend to resent the impositions of goals that may conflict with their vocational ethos and judgment, and thus morale is lowered. Almost inevitably, many people become adept at manipulating performance indicators through a variety of methods, many of which are ultimately dysfunctional for their organizations. They fudge the data or deal only with cases that will improve performance indicators.

13%

Let’s begin with problems of the distortion of information. Measuring the most easily measurable. There is a natural human tendency to try to simplify problems by focusing on the most easily measurable elements. But what is most easily measured is rarely what is most important, indeed sometimes not important at all. That is the first source of metric dysfunction. Closely related is measuring the simple when the desired outcome is complex. Most jobs have multiple responsibilities and most organizations have multiple goals. Focusing measurement on just one responsibility or goal often leads to ...more

This highlight has been truncated due to consecutive passage length restrictions.

14%

Gaming through creaming. This takes place when practitioners find simpler targets or prefer clients with less challenging circumstances, making it easier to reach the metric goal, but excluding cases where success is more difficult to achieve. Improving numbers by lowering standards. One way of improving metric scores is by lowering the criteria for scoring. Thus, for example, graduation rates of high schools and colleges can be increased by lowering the standards for passing. Or airlines improve their on-time performance by increasing the scheduled flying time of their flights. Improving ...more

16%

The decades in which McNamara rose from business school professor, to Ford Motor Company executive, to Secretary of Defense, and finally to president of the World Bank also saw the transformation of American business schools. In an earlier era, business schools had focused on preparing their students for jobs in particular industries and enterprises. From the 1950s onward, the business school ideal became the general manager, equipped with a set of skills that were independent of particular industries. The core of managerial expertise was now defined as a distinct set of skills and techniques, ...more

16%

Before that, “expertise” meant the career-long accumulation of knowledge of a specific field, as one progressed from rung to rung within the same institution or business—accumulating what economists call “task-specific know-how.” Auto executives were “car guys”—men who had spent much of their professional life in the automotive industry. They were increasingly replaced by McNamara-like “bean counters,” adept at calculating costs and profit margins.

17%

The role of judgment grounded in experience and a deep knowledge of context was downplayed. The premise of managerialism is that the differences among organizations—including private corporations, government agencies, and universities—are less important than the similarities.

18%

The suspicion of authority was intrinsic to the post-1960s political left: to rely upon the judgment of experts was to surrender to the prejudices of established elites. Thus, the left had its reasons for advancing an agenda that professed to make institutions accountable and transparent, using the purportedly objective and scientific standards of measured performance. On the right there was the suspicion, sometimes well founded, that public-sector institutions were being run more for the benefit of their employees than their clients and constituents. In some schools, police departments, and ...more

20%

Those at the top face to a greater degree than most of us a cognitive constraint that confronts all of us: making decisions despite having limited time and ability to deal with information overload. Metrics are a tempting means of dealing with this “bounded rationality,” and engaging with matters beyond one’s comprehension.

20%

Imagine, for example, that you become the president of a large university, corporation, or cabinet department. You might, of course, rely on the informed opinion of experienced subordinates. But they are likely to have an intrinsic interest in the status quo: recall the dictum of the late poet and historian Robert Conquest—“Everyone is conservative about what they know best.”

21%

Principal-agent theory articulates in abstract terms the general suspicion that those employed in institutions are not to be trusted; that their activity must be monitored and measured; that those measures need to be transparent to those without firsthand knowledge of the institutions; and that pecuniary rewards and punishments are the most effective way to motivate “agents.” Here too, numbers are seen as a guarantee of objectivity, and as a replacement for intimate knowledge and personal trust.

26%

In 1993, President Bill Clinton signed the Government Performance and Results Act, which required all agencies to develop mission statements, long-range strategic plans, and annual performance goals, together with descriptions of the measures to be used to gauge progress toward those goals. Initiated by Republican legislators and signed by a Democratic president, the act enjoyed bipartisan support. In 2004, during the presidency of George W. Bush, the federal government’s venerable General Accounting Office was rechristened the Government Accountability Office.

30%

In an attempt to obtain “value,” successive British administrations have created a series of government agencies charged with evaluating the country’s universities, with titles such as the “Quality Assurance Agency.” There are audits of teaching quality, such as the “Teaching Quality Assessment,” evaluated largely on the extent to which various procedures are followed and paperwork filed, few of which have much to do with actual teaching.

30%

The effect is to increase costs or to divert spending from the doers to the administrators—which usually suits the latter just fine.

32%

In academia as elsewhere, that which gets measured gets gamed.

33%

Rankings create incentives for universities to become more like what the rankings measure. What gets measured is what gets attention. That leads to homogenization as they abandon their distinctive missions and become more like their competitors.

35%

Let us leave aside the accuracy and reliability of these metrics to explore a more important issue: the message conveyed by the metrics themselves. The College Scorecard treats college education in purely economic terms: its sole concern is return on investment, understood as the relationship between the monetary costs of college and the increase in earnings that a degree will ultimately provide.

36%

Under the NCLB act, enacted early in Bush’s presidency, states were to test every student in grades 3–8 each year in math, reading, and science. The act was meant to bring all students to “academic proficiency”

37%

Campbell’s Law (explained in chapter 1) predicts: it destroys the predictive validity of the tests themselves. Tests of performance are designed to evaluate the knowledge and ability that students have acquired in their general education. When that education becomes focused instead on developing the students’ performance on the tests, the test no longer measures what it was created to evaluate. If, for example, class time is diverted to practicing multiple choice questions that resemble those on the test (perhaps by using questions from past tests), students may attain higher test scores—but ...more

39%

Such outcomes might lead one to conclude that the achievement gap cannot in fact be closed by education—and that the reasons lie beyond the schoolhouse door. Yet measuring continues unabated. That is perhaps because, as Banfield noted, the idea that some problems are insoluble is morally unacceptable to a substantial portion of educated Americans.

46%

Physician report cards create as many problems as they solve. Take the phenomenon of risk-aversion. Numerous studies have shown that cardiac surgeons became less willing to operate on severely ill patients in need of surgery after the introduction of publicly available metrics. In New York State, for example, the report cards for surgeons report on postoperative mortality rates for coronary bypass surgery, that is, what percentage of the patients operated upon remain alive thirty days after the procedure. After the metrics were instituted, the mortality rates did indeed decline—which seems ...more

47%

In addition to the tangible costs of gathering, inputting, and processing this tsunami of data, there are the incalculable opportunity costs of what doctors and other clinicians might have done with the time they must devote to inputting data. Moreover, the time invested is largely uncalculated and uncompensated. It typically falls out of consideration when medical costs are discussed.

47%

“Pay for performance” reduces intrinsic motivation. Many tasks, especially in health care, are potentially intrinsically satisfying. Relieving pain, answering questions, exercising manual dexterity, being confided in, working on a professional team, solving puzzles, and experiencing the role of a trusted authority—these are not at all bad ways to spend part of one’s day at work. Pride and joy in the work of caring is among the many motivations that do result in “performance” among health care professionals.

48%

As in the case of schools punished for the poor performance of their students on standardized tests, by penalizing the least successful hospitals, performance metrics may end up exacerbating inequalities in the distribution of resources—hardly a contribution to the public health they are supposed to improve.

49%

But metrics tend to be most successful for those interventions and outcomes that are almost entirely controlled by and within the organization’s medical system, as in the case of checklists of procedures to minimize central line–induced infections. When the outcomes are dependent upon more wide-ranging factors (such as patient behavior outside the doctor’s office and the hospital), they become more difficult to attribute to the efforts or failures of the medical system.

50%

“When targets are set by offices such as the Mayor’s Office for Policing and Crime, what they think they are asking for are 20% fewer victims. That translates into ‘record 20% fewer crimes’

54%

A great deal of corporate dysfunction comes from pay-for-performance schemes that are narrowly tailored to measure a single outcome.

63%

UNINTENDED BUT PREDICTABLE NEGATIVE CONSEQUENCES

63%

the purpose of social science was articulated in the nineteenth century by Auguste Comte: Savoir pour prévoir, prévoir pour prévenir (Know in order to predict, predict in order to avert [the previously unanticipated consequences of our actions]).

63%

Goal displacement through diversion of effort to what gets measured. Goal displacement comes in many varieties. When performance is judged by a few measures, and the stakes are high (keeping one’s job, getting a raise, raising the stock price at the time that stock options are vested), people will focus on satisfying those measures—often at the expense of other, more important organizational goals that are not measured. Economists Bengt Holmström and Paul Milgrom have described it in more formal terms as a problem of misaligned incentives: workers who are rewarded for the accomplishment of ...more

63%

Promoting short-termism. Measured performance encourages what Robert K. Merton called “the imperious immediacy of interests … where the actor’s paramount concern with the foreseen immediate consequences excludes consideration of further or other consequences.” In short, advancing short-term goals at the expense of long-range considerations.

63%

Costs in employee time. To the debit side of the ledger must also be added the transactional costs of metrics: the expenditure of employee time by those tasked with compiling and processing the metrics—not to speak of the time required to actually read them. That is exacerbated by the “reporting imperative”—the perceived need to constantly generate information, even when nothing significant is going on. Sometimes the metric of success i...

63%

Diminishing utility. Sometimes, newly introduced performance metrics will have immediate benefits in discovering poorly performing outliers. Having gleaned the low-hanging fruit, there is a tendency to expect a continuingly bountiful harvest. The problem is that the metrics continue to get collected from everyone. And soon the marginal costs of assembling and analyzing the metrics exceed the marginal benefits.

64%

Rule cascades. In an attempt to staunch the flow of faulty metrics through gaming, cheating, and goal diversion, organizations institute a cascade of rules. Complying with them further slows down the institution’s functioning and diminishes its efficiency.

64%

Rewarding luck. Measuring outcomes when the people involved have little control over the results is tantamount to rewarding luck. It means that people are rewarded or penalized for outcomes that are actually independent of their efforts. ...

64%

Discouraging innovation. When people are judged by performance metrics, they are incentivized to do what the metrics measure, and what the metrics measure will be some established goal. But that impedes innovation, which means doing something that is not yet established, indeed hasn’t been tried out. Innovation involves experimentation. Trying out something new entails risk, including the possibility, perhaps probability, of failure. When performance metrics discourage risk they inadvertently promote stagnation.

64%

Discouraging cooperation and common purpose. Rewarding individuals for measured performance diminishes the sense of common purpose as well as the social relationships that provide the unmeasurable motivation for cooperation and institutional effectiveness. Reward based on measured performance tends to promote not cooperation but competition. If the individuals or units respond to the incentives created, rather than aiding, assisting, and advising one another, they strive to maximize their own metrics, ignoring, or even sabotaging, their fellows.

66%

Remember that, as we’ve seen, performance metrics that link reward and punishment may actually help reinforce intrinsic motivation when the goals to be rewarded accord with the professional goals of the practitioners. If, on the other hand, the scheme of reward and punishment is meant to elicit behavior that the practitioners consider useless or harmful, the metrics are more likely to be manipulated in the many ways we’ve explored.

67%

Ask why the people at the top of the organization are demanding performance metrics. As we’ve noted, the demand for performance measures sometimes flows from the ignorance of executives about the institutions they’ve been hired to manage, and that ignorance is often a result of parachuting into an organization with which one has little experience. Since experience and local knowledge matter, lean toward hiring from within. Even if there is someone smarter and more successful elsewhere, their lack of particular knowledge of your company, university, government agency, or other organization may ...more

67%

Measurements are more likely to be meaningful when they are developed from the bottom up, with input from teachers, nurses, and the cop on the beat.

Preview — The Tyranny of Metrics by Jerry Z. Muller