Beau Gunderson’s Kindle Notes & Highlights for The Tyranny of Metrics

It is also, in its way, a business management book. But as I came to realize after finishing the book, it is a book about management as seen in good part from the perspective of the managed, which sets it off from most books in that genre.

4%

Trends in contemporary advertising illustrate the phenomenon of “measurability bias,” the tendency to prefer options simply because they can more easily be measured. Rather than engage in brand development by advertising in a wide range of media, for example, companies increasingly prefer to advertise only in venues that provide “direct response” in the form of clicking on links, on the grounds that these can be measured, while the effect of billboards, television advertisements or newspaper ads cannot.

5%

When numbers, standardized measurement of performance, and big data are seen as the wave of the future, professional judgment based upon experience and talent is seen as retrograde, almost anachronistic.

5%

Add to that the lure of information technology. The growing opportunities to collect data, and the declining cost of doing so, contribute to the belief that data is the answer, for which organizations have to come up with questions. There is an unexamined faith that amassing metric data and sharing it widely within an organization will result in improvements of some sort. So who needs judgment based upon experience and talent? The contention of this book is that you do.

7%

There are things that can be measured. There are things that are worth measuring. But what can be measured is not always what is worth measuring; what gets measured may have no relationship to what we really want to know. The costs of measuring may be greater than the benefits. The things that get measured may draw effort away from the things we really care about. And measurement may provide us with distorted knowledge—knowledge that seems solid but is actually deceptive.

7%

When their scores are used as a basis of reward and punishment, surgeons, as do others under such scrutiny, engage in creaming, that is, they avoid the riskier cases. When hospitals are penalized based on the percentage of patients who fail to survive for thirty days beyond surgery, patients are sometimes kept alive for thirty-one days, so that their mortality is not reflected in the hospital’s metrics. In England, in an attempt to reduce wait times in emergency wards, the Department of Health adopted a policy that penalized hospitals with wait times longer than four hours. The program ...more

8%

The most characteristic feature of metric fixation is the aspiration to replace judgment based on experience with standardized measurement.

8%

Concrete interests of power, money, and status are at stake. Metric fixation leads to a diversion of resources away from frontline producers toward managers, administrators, and those who gather and manipulate data.

11%

The key components of metric fixation are ■ the belief that it is possible and desirable to replace judgment, acquired by personal experience and talent, with numerical indicators of comparative performance based upon standardized data (metrics); ■ the belief that making such metrics public (transparent) assures that institutions are actually carrying out their purposes (accountability); ■ the belief that the best way to motivate people within these organizations is by attaching rewards and penalties to their measured performance, rewards that are either monetary (pay-for-performance) or ...more

12%

In the process, the nature of work is transformed in ways that are often pernicious. Professionals tend to resent the impositions of goals that may conflict with their vocational ethos and judgment, and thus morale is lowered. Almost inevitably, many people become adept at manipulating performance indicators through a variety of methods, many of which are ultimately dysfunctional for their organizations. They fudge the data or deal only with cases that will improve performance indicators. They fail to report negative instances. In extreme cases, they fabricate the evidence.

12%

Whenever reward is tied to measured performance, metric fixation invites gaming.

12%

What has come to be called “Campbell’s Law,” named for the American social psychologist Donald T. Campbell, holds that “[t]he more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”

12%

Trying to force people to conform their work to preestablished numerical goals tends to stifle innovation and creativity—valuable qualities in most settings. And it almost inevitably leads to a valuation of short-term goals over long-term purposes.

12%

In situations where there are no real feasible solutions to a problem, the gathering and publication of performance data serves as a form of virtue signaling. There is no real progress to show, but the effort demonstrated in gathering and publicizing the data satisfies a sense of moral earnestness. In lieu of real progress, the progress of measurement becomes a simulacrum of success.

13%

All of that is not intended to claim that measurement is useless or intrinsically pernicious. One of the purposes of this book is to specify when performance metrics are genuinely useful—how to use metrics without the characteristic dysfunctions of metric fixation.

13%

Measuring the most easily measurable. There is a natural human tendency to try to simplify problems by focusing on the most easily measurable elements. But what is most easily measured is rarely what is most important, indeed sometimes not important at all. That is the first source of metric dysfunction. Closely related is measuring the simple when the desired outcome is complex. Most jobs have multiple responsibilities and most organizations have multiple goals. Focusing measurement on just one responsibility or goal often leads to deceptive results. Measuring inputs rather than outcomes. ...more

14%

Gaming through creaming. This takes place when practitioners find simpler targets or prefer clients with less challenging circumstances, making it easier to reach the metric goal, but excluding cases where success is more difficult to achieve. Improving numbers by lowering standards. One way of improving metric scores is by lowering the criteria for scoring. Thus, for example, graduation rates of high schools and colleges can be increased by lowering the standards for passing. Or airlines improve their on-time performance by increasing the scheduled flying time of their flights. Improving ...more

15%

Workers who carried out their tasks more slowly than the prescribed time were paid at a lower rate per unit of output; those who met the expectation were rewarded at a higher rate. Taylor also advocated an elaborate system for monitoring and controlling the workplace. His goal was to increase efficiency by standardizing and speeding up work on the factory floor to create mass production. Specialization and standardization of tasks, recording and reporting of all activity, pecuniary carrots and sticks—these were the legacy of Taylor and his disciples to subsequent generations. Taylorism was ...more

16%

Decisions based on numbers were viewed as scientific, since numbers were thought to imply objectivity and accuracy. Management theorists and gurus who dispensed this new wisdom ascended to the office once ascribed by Shelley to poets as “the unacknowledged legislators of mankind.” Before that, “expertise” meant the career-long accumulation of knowledge of a specific field, as one progressed from rung to rung within the same institution or business—accumulating what economists call “task-specific know-how.” Auto executives were “car guys”—men who had spent much of their professional life in ...more

17%

“Under the guidance of civilian officials—many of whom care little about their ignorance of strategy, operational craft, and tactics, and present themselves as managers capable of managing all things regardless of their content—the military establishment itself long ago accepted the pursuit of business efficiency as its supreme goal.”

17%

That led to what Luttwak called a “materialist bias,” aimed at measuring inputs and tangible outputs (such as firepower), rather than intangible human factors, such as strategy, leadership, group cohesion, and the morale of servicemen. What could be precisely measured tended to overshadow what was really important.

18%

Numbers are regarded as “hard,” and thus a safer bet for those disposed to doubt their own judgments.

19%

In a vicious circle, a lack of social trust leads to the apotheosis of metrics, and faith in metrics contributes to a declining reliance upon judgment.

19%

In a series of books, Philip K. Howard has argued that the decline of trust leads to a new mindset in which “[a]voiding human choice in public decisions is not just a theory … but a kind of theology…. Human choice is considered too dangerous.” As a consequence, “Officials no longer are allowed to act on their best judgment” or to exercise discretion, which is judgment about what the particular situation requires. The result is overregulation: an ever tighter web of rules, including the proliferation of rules within organizations. Often enough, metrics provides the tools for tightening that ...more

19%

In one field after another, the introduction of greater measurement in the name of accountability did shine light upon real problems, including variations in professional practice that were supposedly grounded in “science,” and gaps in performance that had previously gone unnoticed or undocumented. The impact of these revelations both diminished faith in professional judgment and created pressure to find solutions, solutions thought to entail greater measurement in order to monitor the professionals whose ethos had been cast into doubt.

20%

Those at the top face to a greater degree than most of us a cognitive constraint that confronts all of us: making decisions despite having limited time and ability to deal with information overload. Metrics are a tempting means of dealing with this “bounded rationality,” and engaging with matters beyond one’s comprehension.

20%

The demands for a constant stream of reports and standardized data have the effect, intended or inadvertent, of diminishing the autonomy of those lower in the organizational hierarchy—whose doubts about metrics-based innovations are dismissed as irrational or as a self-interested “resistance to change.”

21%

The spreadsheet is a tool, but it is also a worldview—reality by the numbers…. Because spreadsheets can do so many important things, those who use them tend to lose sight of the crucial fact that the imaginary businesses that they can create on their computers are just that—imaginary. You can’t really duplicate a business inside a computer, just aspects of a business. And since numbers are the strength of spreadsheets, the aspects that get emphasized are the ones easily embodied in numbers. Intangible factors aren’t so easily quantified.

21%

The version of the theory prominent in the management literature calls attention to the gap between the purposes of institutions and the people who run them and are employed by them. It focuses on the problem of aligning the interests of shareholders in maximum profitability and stock price with the interests of corporate executives, whose priorities might diverge from those goals. Principal-agent theory articulates in abstract terms the general suspicion that those employed in institutions are not to be trusted; that their activity must be monitored and measured; that those measures need to ...more

23%

Many of the problems of pay-for-performance schemes can be traced to an overly simple, indeed deeply distortive, conception of human motivation, one that assumes that people are motivated to work only by material rewards. For some are motivated less by extrinsic monetary rewards than by various sorts of intrinsic psychic rewards, including their commitment to the goals of the organizations for which they work, or a fascination with the complexity of the work they do, which makes it challenging, interesting, and entertaining.

23%

Some rewards enhance intrinsic motivation. For example, when the rewards are verbal and expressed primarily to convey information (“You did a great job on that!”) rather than to exercise control. Or when awards are given out after the fact, for excellence in achievement, without having been offered as an incentive in advance. Or, in fields such as science or scholarship, when prizes or honorific titles are bestowed to recognize long-term achievement. More broadly, above-market wages can reinforce employees’ intrinsic motivation if those wages are perceived as a signal of the ...more

24%

But when mission-oriented organizations try to use extrinsic rewards, as in promises of pay-for-performance, the result may actually be counterproductive. The use of extrinsic rewards for activities of high intrinsic interest leads people to focus on the rewards and not on the intrinsic interest of the task, or on the larger mission of which it is a part. The result is a “crowding out” of intrinsic motivation: having been taught to think of their work tasks primarily as a means toward monetary goals, they lose interest in doing the work for the sake of the larger mission of the institution.

24%

Robert Gibbons, a professor of organizational economics at MIT, pointed out that in fact the principal (the owner of the firm, for example) profits from a variety of outputs from the agent (the employee), and that many of these outputs are not highly visible or measurable in any numerical sense. Organizations depend on employees engaging in mentoring and in team work, for example, which are often at odds with what the employees would do if their only interests were to maximize their measured performance for purposes of compensation. Thus, there is a gap between the measurable contribution ...more

26%

metrics “inhibits risk-taking, an inevitable concomitant of exploration and creativity. We are less likely to take chances, to play with possibilities, and to follow hunches, which may, after all, not pay off.”

26%

A hallmark of practical, local knowledge, as James Scott has noted, is that “it is as economical and accurate as it needs to be, no more and no less, for addressing the problem at hand.” By contrast, the degree of numerical precision promised by metrics may be far greater than is required by actual practitioners, and attaining that precision requires an expenditure of time and effort that may not be worthwhile. The quest for precision may therefore be wasteful, and resented for that reason by those required to sacrifice their time and ingenuity.

26%

judgment is a sort of skill at grasping the unique particularities of a situation, and it entails a talent for synthesis rather than analysis, “a capacity for taking in the total pattern of a human situation, of the way in which things hang together.” A feel for the whole and a sense for the unique are precisely what numerical metrics cannot supply.

41%

At the same time as pressure to control costs is escalating, the new technology of electronic health records has made the collection of medical data more readily obtainable, creating a temptation to exploit the data to identify problems.

42%

Perhaps the most popular trend in American health policy is the promotion of performance metrics, accountability, and transparency. Measured performance is supposed to allow practitioners to better assess clinical practices and to track their implementation; allow insurers to reward success and penalize failure; and through ratings and report cards, create transparency in ways that will allow patients to make more informed choices about medical providers.

43%

The Keystone project includes gathering monthly data on infection rates, which are reported to the leaders of intensive care units and to top hospital officials. The results are discussed with the larger staff, with an eye to learning from mistakes. This is an instance of diagnostic metrics. It provides data that can be used by a practitioner (physician), or internally within an institution (hospital), or shared among practitioners and institutions to discover what is working and what is not, and to use that information to improve performance.

44%

“the men and women who actually work in the service lines themselves chose which care processes to change. Involving them directly in decision making secured their buy-in and made success more likely.”

44%

What we can learn from the Geisinger example is the importance of having providers develop and monitor performance measures. The fact that the measures were in keeping with their own professional sense of mission was crucial.

46%

There is now a large social scientific literature on the impact of pay-for-performance and public performance metrics in the United States, the United Kingdom, and elsewhere. What is quite astonishing is how often these techniques—so obviously effective according to economic theory—have no discernable effect on outcomes.

46%

“We found that public reporting of mortality rates has had no impact on patient outcomes. We looked at every subgroup. We even examined those that were labeled as bad performers to see if they would improve more quickly. They didn’t. In fact, if you were going to be faithful to the data, you would conclude that public reporting slowed down the rate of improvement in patient outcomes.”

46%

Among the intrinsic problems of P4P and public rankings is goal diversion.

46%

The British P4P program led to lower quality of care for those medical conditions that were not part of the program.

47%

The phenomenon of risk-aversion means that some patients whose lives might be saved by a risky operation are simply never operated upon. But there is also the reverse problem, that of overly aggressive care to meet metric targets. Patients whose operations are not successful may be kept alive for the requisite thirty days to improve their hospital’s mortality data, a prolongation that is both costly and inhumane.

47%

Just how costly and burdensome the pursuit of ever more medical metrics has become is evident in a recent report from the Institute of Medicine. At major medical centers, the cost of reporting quality measures to government regulators and insurers amounted to 1 percent of net revenue. Administrative costs for measurement and related activities are estimated at $190 billion per year. Then there is the unmeasurable cost of providers entering data into the government’s Patient Quality Reporting Systems. Larger medical practices must pay external firms to enter the data; in smaller practices, ...more

47%

Add to this the psychic costs of treating medicine as if it were primarily a profit-making enterprise. Berwick captured this brilliantly in his article, “The Toxicity of Pay for Performance”: “Pay for performance” reduces intrinsic motivation. Many tasks, especially in health care, are potentially intrinsically satisfying. Relieving pain, answering questions, exercising manual dexterity, being confided in, working on a professional team, solving puzzles, and experiencing the role of a trusted authority—these are not at all bad ways to spend part of one’s day at work. Pride and joy in the work ...more

48%

Hospital readmissions have indeed declined, a much-touted success for performance metrics. But how much of that success is real? The falling rate of reported readmissions was due in part to gaming the system: instead of formally admitting returning patients, hospitals placed them on “observation status,” under which the patient stays in the hospital for a period of time (up to several days), and is billed for outpatient services rather than an inpatient “admission.” Alternatively, the returning patients were treated in the emergency room. Between 2006 and 2013, such observation stays for ...more

48%

As of 2015, about three-quarters of the reporting hospitals were penalized by Medicare. Tellingly, major teaching hospitals—which tend to see more difficult patients—were disproportionately affected. So were hospitals in poverty-stricken areas, where patients were less likely to be well taken care of (or to take care of themselves) after their initial discharge from the hospital. Attaining the goal of reduced admissions depends not only on the steps that the hospital takes to educate the patient and provide necessary medications, but also on many factors over which the hospital has little ...more

Preview — The Tyranny of Metrics by Jerry Z. Muller