Kindle Notes & Highlights
Weapons of Math Destruction by Cathy O'Neil
Read between February 13 - March 27, 2022
The privileged, we’ll see time and again, are processed more by people, the masses by machines.
Ill-conceived mathematical models now micromanage the economy, from advertising to prisons.
But there’s one important distinction between a school district’s value-added model and, say, a WMD that scouts out prospects for extortionate payday loans. They have different payoffs. For the school district, the payoff is a kind of political currency, a sense that problems are being fixed. But for businesses it’s just the standard currency: money. For many of the businesses running these rogue algorithms, the money pouring in seems to prove that their models are working. Look at it through their eyes and it makes sense. When they’re building statistical systems to find customers or
…
This may sound obvious, but as we’ll see throughout this book, the folks building WMDs routinely lack data for the behaviors they’re most interested in. So they substitute stand-in data, or proxies. They draw statistical correlations between a person’s zip code or language patterns and her potential to pay back a loan or handle a job. These correlations are discriminatory, and some of them are illegal. Baseball models, for the most part, don’t use proxies because they use pertinent inputs like balls, strikes, and hits.
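A minimal sketch of the substitution this passage describes, in Python; the zip codes, default rates, and scoring formulas below are invented for illustration and are not drawn from any real lender:

```python
# A sketch of scoring by proxy versus scoring by pertinent inputs. The zip
# codes, default rates, and formulas are hypothetical.

ZIP_DEFAULT_RATE = {"10451": 0.22, "10577": 0.04}   # invented history

def proxy_score(zip_code: str) -> float:
    """Score an applicant by where she lives, not by what she has done."""
    return 1.0 - ZIP_DEFAULT_RATE.get(zip_code, 0.10)

def pertinent_score(on_time_payments: int, missed_payments: int) -> float:
    """Score an applicant by her own repayment record."""
    total = on_time_payments + missed_payments
    return on_time_payments / total if total else 0.5

# Two applicants with identical repayment records, different zip codes:
print(proxy_score("10451"), proxy_score("10577"))  # 0.78 vs 0.96
print(pertinent_score(36, 1))                      # ~0.97 for both
```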
A model’s blind spots reflect the judgments and priorities of its creators.
Models are opinions embedded in mathematics.
But as the questions continue, delving deeper into the person’s life, it’s easy to imagine how inmates from a privileged background would answer one way and those from tough inner-city streets another. Ask a criminal who grew up in comfortable suburbs about “the first time you were ever involved with the police,” and he might not have a single incident to report other than the one that brought him to prison. Young black males, by contrast, are likely to have been stopped by police dozens of times, even when they’ve done nothing wrong. A 2013 study by the New York Civil Liberties Union found
…
But even if we put aside, ever so briefly, the crucial issue of fairness, we find ourselves descending into a pernicious WMD feedback loop. A person who scores as “high risk” is likely to be unemployed and to come from a neighborhood where many of his friends and family have had run-ins with the law. Thanks in part to the resulting high score on the evaluation, he gets a longer sentence, locking him away for more years in a prison where he’s surrounded by fellow criminals—which raises the likelihood that he’ll return to prison. He is finally released into the same poor neighborhood, this time
…
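The loop this highlight describes can be made concrete with a toy simulation; every parameter below (the sentencing formula, the reoffense curve, the score update) is an invented stand-in, not the actual recidivism model:

```python
# A toy simulation of the loop: score -> longer sentence -> more years
# inside -> higher chance of return -> higher next score.

def sentence_years(risk_score: float) -> float:
    # Hypothetical policy: a base term plus a premium for a high score.
    return 2.0 + 6.0 * risk_score

def reoffense_prob(years_inside: float) -> float:
    # Hypothetical dose-response: more years inside, more likely to return.
    return min(0.95, 0.5 + 0.06 * years_inside)

score = 0.8  # rated "high risk", largely from neighborhood and family proxies
for cycle in range(3):
    years = sentence_years(score)
    p = reoffense_prob(years)
    # The model reads the return it helped cause as confirmation.
    score = min(1.0, 0.5 * score + 0.5 * p)
    print(f"cycle {cycle}: {years:.1f} years, P(return)={p:.2f}, next score={score:.2f}")
```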
Transparency matters. And yet many companies go out of their way to hide the results of their models or even their existence. One common justification is that the algorithm constitutes a “secret sauce” crucial to their business. It’s intellectual property, and it must be defended, if need be, with legions of lawyers and lobbyists. In the case of web giants like Google, Amazon, and Facebook, these precisely tailored algorithms alone are worth hundreds of billions of dollars. WMDs are, by design, inscrutable black boxes. That makes it extra hard to definitively answer the second question: Does
…
And finally, you might note that not all of these WMDs are universally damaging. After all, they send some people to Harvard, line others up for cheap loans or good jobs, and reduce jail sentences for certain lucky felons. But the point is not whether some people benefit. It’s that so many suffer. These models, powered by algorithms, slam doors in the face of millions of people, often for the flimsiest of reasons, and offer no appeal. They’re unfair.
As with so many WMDs, the math was directed against the consumer as a smoke screen. Its purpose was only to optimize short-term profits for the sellers. And those sellers trusted that they’d manage to unload the securities before they exploded. Smart people would win. And dumber people, the providers of dumb money, would wind up holding billions (or trillions) of unpayable IOUs. Even rigorous mathematicians—and there were a few—were working with numbers provided by people carrying out wide-scale fraud. Very few people had the expertise and the information required to know what was actually
…
By 2009, it was clear that the lessons of the market collapse had brought no new direction to the world of finance and had instilled no new values. The lobbyists succeeded, for the most part, and the game remained the same: to rope in dumb money. Except for a few regulations that added a few hoops to jump through, life went on.
I saw all kinds of parallels between finance and Big Data. Both industries gobble up the same pool of talent, much of it from elite universities like MIT, Princeton, or Stanford. These new hires are ravenous for success and have been focused on external metrics—like SAT scores and college admissions—their entire lives. Whether in finance or tech, the message they’ve received is that they will be rich, that they will run the world. Their productivity indicates that they’re on the right track, and it translates into dollars. This leads to the fallacious conclusion that whatever they’re doing to
…
I wondered what the analogue to the credit crisis might be in Big Data. Instead of a bust, I saw a growing dystopia, with inequality rising. The algorithms would make sure that those deemed losers would remain that way. A lucky minority would gain ever more control over the data economy, raking in outrageous fortunes and convincing themselves all the while that they deserved it.
As people game the system, the proxy loses its effectiveness. Cheaters wind up as false positives.
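A small sketch of a proxy losing its power once it is gamed; the résumé keyword screen here is a hypothetical stand-in for whatever proxy is being gamed, and the numbers and threshold are invented:

```python
# Before gaming, the keyword count tracks skill; after everyone learns the
# rule, anyone can hit the threshold, and cheaters pass as false positives.

import random
random.seed(0)

def keyword_count(skill: float, games_the_system: bool) -> int:
    if games_the_system:
        return 12                                  # stuff the resume
    return int(skill * 10 + random.random() * 3)   # honest, noisy signal

THRESHOLD = 8
skills = (0.2, 0.4, 0.9)

print([keyword_count(s, False) >= THRESHOLD for s in skills])
# [False, False, True]: only the skilled applicant clears the bar
print([keyword_count(s, True) >= THRESHOLD for s in skills])
# [True, True, True]: everyone passes, the cheaters as false positives
```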
The response to this crackdown on cheating was volcanic. Some two thousand stone-throwing protesters gathered in the street outside the school. They chanted, “We want fairness. There is no fairness if you don’t let us cheat.” It sounds like a joke, but they were absolutely serious. The stakes for the students were sky high. As they saw it, they faced a chance either to pursue an elite education and a prosperous career or to stay stuck in their provincial city, a relative backwater. And whether or not it was the case, they had the perception that others were cheating. So preventing the students
…
The victims, of course, are the vast majority of Americans, the poor and middle-class families who don’t have thousands of dollars to spend on courses and consultants. They miss out on precious insider knowledge. The result is an education system that favors the privileged. It tilts against needy students, locking out the great majority of them—and pushing them down a path toward poverty. It deepens the social divide.
Once the ignorance is established, the key for the recruiter, just as for the snake-oil merchant, is to locate the most vulnerable people and then use their private information against them. This involves finding where they suffer the most, which is known as the “pain point.” It might be low self-esteem, the stress of raising kids in a neighborhood of warring gangs, or perhaps a drug addiction. Many people unwittingly disclose their pain points when they look for answers on Google or, later, when they fill out college questionnaires. With that valuable nugget in hand, recruiters simply promise
…
Once a student has indicated in an online questionnaire that she’ll need financial aid, the for-profit colleges pop up at the top of her list of matching schools.
For-profit colleges also provide free services in exchange for face time with students. Cassie Magesis, another readiness counselor at the Urban Assembly, told me that the colleges provide free workshops to guide students in writing their résumés. These sessions help the students. But impoverished students who provide their contact information are subsequently stalked. The for-profit colleges do not bother targeting rich students. They and their parents know too much.
Once the nuisance data flows into a predictive model, more police are drawn into those neighborhoods, where they’re more likely to arrest more people. After all, even if their objective is to stop burglaries, murders, and rape, they’re bound to have slow periods. It’s the nature of patrolling. And if a patrolling cop sees a couple of kids who look no older than sixteen guzzling from a bottle in a brown bag, he stops them. These types of low-level crimes populate their models with more and more dots, and the models send the cops back to the same neighborhood. This creates a pernicious feedback
…
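A toy model of that feedback loop; both neighborhoods below have the same underlying rate of low-level offenses, and all of the numbers are invented:

```python
# Patrols follow recorded dots, but low-level offenses are only recorded
# where patrols are present to see them.

dots = {"A": 11, "B": 10}      # one extra recorded incident to start
patrols = {"A": 5, "B": 5}

for week in range(10):
    hot = max(dots, key=dots.get)    # the busier-looking neighborhood
    cold = min(dots, key=dots.get)
    if patrols[cold] > 1:
        patrols[hot] += 1            # send another unit toward the dots
        patrols[cold] -= 1
    for n in dots:
        dots[n] += patrols[n]        # recording scales with presence only

print(patrols)  # ends at {'A': 9, 'B': 1}: the dots sent the cops back
```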
We have every reason to believe that more such crimes are occurring in finance right now. If we’ve learned anything, it’s that the driving goal of the finance world is to make a huge profit, the bigger the better, and that anything resembling self-regulation is worthless. Thanks largely to the industry’s wealth and powerful lobbies, finance is underpoliced.
The cops don’t have the expertise for that kind of work. Everything about their jobs, from their training to their bullet-proof vests, is adapted to the mean streets. Clamping down on white-collar crime would require people with different tools and skills. The small and underfunded teams who handle that work, from the FBI to investigators at the Securities and Exchange Commission, have learned through the decades that bankers are virtually invulnerable. They spend heavily on our politicians, which always helps, and are also viewed as crucial to our economy. That protects them. If their banks
…
The result is that we criminalize poverty, believing all the while that our tools are not only scientific but fair.
While looking at WMDs, we’re often faced with a choice between fairness and efficacy. Our legal traditions lean strongly toward fairness. The Constitution, for example, presumes innocence and is engineered to value it. From a modeler’s perspective, the presumption of innocence is a constraint, and the result is that some guilty people go free, especially those who can afford good lawyers. Even those found guilty have the right to appeal their verdict, which chews up time and resources. So the system sacrifices enormous efficiencies for the promise of fairness. The Constitution’s implicit
…
This may sound less than serious. But a crucial part of justice is equality. And that means, among many other things, experiencing criminal justice equally. People who favor policies like stop and frisk should experience it themselves. Justice cannot just be something that one part of society inflicts upon the other.
In this system, the poor and nonwhite are punished more for being who they are and living where they live.
From a mathematical point of view, however, trust is hard to quantify. That’s a challenge for people building models. Sadly, it’s far simpler to keep counting arrests, to build models that assume we’re birds of a feather and treat us as such. Innocent people surrounded by criminals get treated badly, and criminals surrounded by a law-abiding public get a pass. And because of the strong correlation between poverty and reported crime, the poor continue to get caught up in these digital dragnets. The rest of us barely have to think about them.
“The primary purpose of the test,” said Roland Behm, “is not to find the best employee. It’s to exclude as many people as possible as cheaply as possible.”
Defenders of the tests note that they feature lots of questions and that no single answer can disqualify an applicant. Certain patterns of answers, however, can and do disqualify them. And we do not know what those patterns are. We’re not told what the tests are looking for. The process is entirely opaque.
This is a point I’ll be returning to in future chapters: we’ve seen time and again that mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education. It’s up to society whether to use that intelligence to reject and punish them—or to reach out to them with the resources they need. We can use the scale and efficiency that make WMDs so pernicious in order to help people. It all depends on the objective we choose.
The money saved, naturally, comes straight from employees’ pockets. Under the inefficient status quo, workers had not only predictable hours but also a certain amount of downtime. You could argue that they benefited from inefficiency: some were able to read on the job, even study. Now, with software choreographing the work, every minute should be busy. And these minutes will come whenever the program demands it, even if it means clopening from Friday to Saturday (closing the store late Friday night and returning a few hours later to open it Saturday morning).
The root of the trouble, as with so many other WMDs, is the modelers’ choice of objectives. The model is optimized for efficiency and profitability, not for justice or the good of the “team.” This is, of course, the nature of capitalism. For companies, revenue is like oxygen. It keeps them alive. From their perspective, it would be profoundly stupid, even unnatural, to turn away from potential savings. That’s why society needs countervailing forces, such as vigorous press coverage that highlights the abuses of efficiency and shames companies into doing the right thing. And when they come up
…
Credit card companies such as Capital One carry out similar rapid-fire calculations as soon as someone shows up on their website. They can often access data on web browsing and purchasing patterns, which provide loads of insights about the potential customer. Chances are, the person clicking for new Jaguars is richer than the one checking out a 2003 Taurus on Carfax.com. Most scoring systems also pick up the location of the visitor’s computer. When this is matched with real estate data, they can draw inferences about wealth. A person using a computer on San Francisco’s Balboa Terrace is a far
…
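A sketch of the kind of instant wealth inference this passage describes; aside from Balboa Terrace, which the text itself mentions, the neighborhoods, home values, and weights are invented placeholders for whatever a real scoring system uses:

```python
# Match an inferred neighborhood against real estate values, blend in a
# clickstream signal, and guess wealth before any offer is made.

MEDIAN_HOME_VALUE = {
    "Balboa Terrace, SF": 1_900_000,
    "Hypothetical Flats": 520_000,
}

def e_score(neighborhood: str, browsed_luxury_goods: bool) -> float:
    """Crude wealth guess in [0, 1] from location and clickstream proxies."""
    wealth = MEDIAN_HOME_VALUE.get(neighborhood, 600_000) / 2_000_000
    signal = 0.2 if browsed_luxury_goods else 0.0
    return min(1.0, wealth + signal)

print(e_score("Balboa Terrace, SF", True))    # 1.0: steered to prime offers
print(e_score("Hypothetical Flats", False))   # 0.26: steered to costlier terms
```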
After all, our credit history includes highly personal data, and it makes sense that we should have control over who sees it. But the consequence is that companies end up diving into largely unregulated pools of data, such as clickstreams and geo-tags, in order to create a parallel data marketplace. In the process, they can largely avoid government oversight. They then measure success by gains in efficiency, cash flow, and profits. With few exceptions, concepts like justice and transparency don’t fit into their algorithms.
In his search for financial responsibility, the banker could have dispassionately studied the numbers (as some exemplary bankers no doubt did). But instead he drew correlations to race, religion, and family connections. In doing so, he avoided scrutinizing the borrower as an individual and instead placed him in a group of people—what statisticians today would call a “bucket.” “People like you,” he decided, could or could not be trusted. Fair and Isaac’s great advance was to ditch the proxies in favor of the relevant financial data, like past behavior with respect to paying bills. They focused
…
In other words, the modelers for e-scores have to make do with trying to answer the question “How have people like you behaved in the past?” when ideally they would ask, “How have you behaved in the past?”
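The two questions can be written as two scoring functions; the bucket label, default rates, and payment counts below are hypothetical:

```python
# An e-score averages over a bucket of "people like you"; a FICO-style
# score reads your own record.

BUCKET_DEFAULT_RATE = {"renter, zip 10451, prepaid phone": 0.25}

def people_like_you_score(bucket: str) -> float:
    return 1.0 - BUCKET_DEFAULT_RATE.get(bucket, 0.10)

def your_own_score(bills_paid_on_time: int, bills_total: int) -> float:
    return bills_paid_on_time / bills_total

# The same diligent borrower, scored both ways:
print(people_like_you_score("renter, zip 10451, prepaid phone"))  # 0.75
print(your_own_score(59, 60))                                     # ~0.98
```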
Good credit, they argue, is an attribute of a responsible person, the kind they want to hire. But framing debt as a moral issue is a mistake. Plenty of hardworking and trustworthy people lose jobs every day as companies fail, cut costs, or move jobs offshore. These numbers climb during recessions. And many of the newly unemployed find themselves without health insurance. At that point, all it takes is an accident or an illness for them to miss a payment on a loan. Even with the Affordable Care Act, which reduced the ranks of the uninsured, medical expenses remain the single biggest cause of
…
After all, “The more data, the better” is the guiding principle of the Information Age. Yet in the name of fairness, some of this data should remain uncrunched.
Humans in the data economy are outliers and throwbacks. The systems are built to run automatically as much as possible. That’s the efficient way; that’s where the profits are. Errors are inevitable, as in any statistical program, but the quickest way to reduce them is to fine-tune the algorithms running the machines.
That may sound a tad cynical. But consider the price optimization algorithm at Allstate, the insurer self-branded as “the Good Hands People.” According to a watchdog group, the Consumer Federation of America, Allstate analyzes consumer and demographic data to determine the likelihood that customers will shop for lower prices. If they aren’t likely to, it makes sense to charge them more. And that’s just what Allstate does.
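A minimal sketch of the pricing logic the watchdog group describes; the loyalty model and markup formula are invented, since Allstate's actual model is not public:

```python
# Quote a higher premium to customers the model predicts won't shop around.

def predicted_shop_probability(years_with_insurer: int, age: int) -> float:
    """Hypothetical loyalty model: long-tenured customers shop around less."""
    return max(0.05, 0.6 - 0.04 * years_with_insurer - 0.003 * (age - 30))

def quoted_premium(actuarial_cost: float, shop_prob: float) -> float:
    markup = 1.0 + 0.3 * (1.0 - shop_prob)   # markup tracks inertia, not risk
    return actuarial_cost * markup

for years, age in [(0, 30), (15, 62)]:
    p = predicted_shop_probability(years, age)
    print(f"{years}y customer, age {age}: premium {quoted_premium(1000.0, p):.2f}")
# The new shopper is quoted 1120.00; the loyal 62-year-old, 1285.00,
# for the same underlying risk.
```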
In the world of WMDs, privacy is increasingly a luxury that only the wealthy can afford.
Insurance is an industry, traditionally, that draws on the majority of the community to respond to the needs of an unfortunate minority. In the villages we lived in centuries ago, families, religious groups, and neighbors helped look after each other when fire, accident, or illness struck. In the market economy, we outsource this care to insurance companies, which keep a portion of the money for themselves and call it profit.
My point is that oceans of behavioral data, in coming years, will feed straight into artificial intelligence systems. And these will remain, to human eyes, black boxes. Throughout this process, we will rarely learn about the tribes we “belong” to or why we belong there. In the era of machine intelligence, most of the variables will remain a mystery. Many of those tribes will mutate hour by hour, even minute by minute, as the systems shuttle people from one group to another. After all, the same person acts very differently at 8 a.m. and 8 p.m. These automatic programs will increasingly
…
Nearly one dollar of every five we earn feeds the vast health care industry.
Once companies amass troves of data on employees’ health, what will stop them from developing health scores and wielding them to sift through job candidates? Much of the proxy data collected, whether step counts or sleeping patterns, is not protected by law, so it would theoretically be perfectly legal. And it would make sense. As we’ve seen, they routinely reject applicants on the basis of credit scores and personality tests. Health scores represent a natural—and frightening—next step.
But isn’t it a good thing, wellness advocates will ask, to help people deal with their weight and other health issues? The key question is whether this help is an offer or a command. If companies set up free and voluntary wellness programs, few would have reason to object. (And workers who opt in to such programs do, in fact, register gains, though they might well have done so without them.) But tying a flawed statistic like BMI to compensation, and compelling workers to mold their bodies to the corporation’s ideal, infringes on freedom. It gives companies an excuse to punish people they don’t
…
I post a petition on my Facebook page. Which of my friends will see it on their news feed? I have no idea. As soon as I hit send, that petition belongs to Facebook, and the social network’s algorithm makes a judgment about how to best use it. It calculates the odds that it will appeal to each of my friends. Some of them, it knows, often sign petitions, and perhaps share them with their own networks. Others tend to scroll right past. At the same time, a number of my friends pay more attention to me and tend to click the articles I post. The Facebook algorithm takes all of this into account as
…
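A sketch of the per-friend calculation this passage describes; the friends, probabilities, weights, and threshold are all invented, since the real ranking is a black box:

```python
# For each friend, estimate the odds of engagement with the petition, then
# show it only where expected engagement clears a threshold.

friends = {
    "ana":   {"signs_petitions": 0.7, "clicks_my_posts": 0.8},
    "ben":   {"signs_petitions": 0.1, "clicks_my_posts": 0.6},
    "carol": {"signs_petitions": 0.4, "clicks_my_posts": 0.2},
}

def engagement_odds(profile: dict) -> float:
    # One of many possible blends of past-behavior signals.
    return 0.6 * profile["signs_petitions"] + 0.4 * profile["clicks_my_posts"]

SHOW_THRESHOLD = 0.4
feed = [name for name, p in friends.items() if engagement_odds(p) >= SHOW_THRESHOLD]
print(feed)  # ['ana']: the algorithm, not the author, decides who sees it
```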
Other publicly held corporations, including Google, Apple, Microsoft, Amazon, and cell phone providers like Verizon and AT&T, have vast information on much of humanity—and the means to steer us in any way they choose. Usually, as we’ve seen, they’re focused on making money. However, their profits are tightly linked to government policies. The government regulates them, or chooses not to, approves or blocks their mergers and acquisitions, and sets their tax policies (often turning a blind eye to the billions parked in offshore tax havens). This is why tech companies, like the rest of corporate
…
The Facebook campaign started out with a constructive and seemingly innocent goal: to encourage people to vote. And it succeeded. After comparing voting records, researchers estimated that their campaign had increased turnout by 340,000 people. That’s a big enough crowd to swing entire states, and even national elections. George W. Bush, after all, won in 2000 by a margin of 537 votes in Florida. The activity of a single Facebook algorithm on Election Day, it’s clear, could not only change the balance of Congress but also decide the presidency.