Kindle Notes & Highlights
Weapons of Math Destruction by Cathy O'Neil
Read between October 7, 2017 and January 15, 2018
the concept of fairness utterly escapes them. Programmers don’t know how to code for it, and few of their bosses ask them to.
So fairness isn’t calculated into WMDs. And the result is massive, industrial production of unfairness.
Justice cannot just be something that one part of society inflicts upon the other.
But prison systems, which are awash in data, do not carry out this highly important research. All too often they use data to justify the workings of the system but not to question or improve the system.
targets recidivism and encourages it.
Officials in Boston, for example, were considering using security cameras to scan thousands of faces at outdoor concerts. This data would be uploaded to a service that could match each face against a million others per second. In the end, officials decided against it. Concern for privacy, on that occasion, trumped efficiency. But this won’t always be the case.
Kyle Behm
proxies are bound to be inexact and often unfair.
The Supreme Court ruled in a 1971 case, Griggs v. Duke Power Company, that intelligence tests for hiring were discriminatory and therefore illegal.
research suggests that personality tests are poor predictors of job performance.
“The primary purpose of the test,” said Roland Behm, “is not to find the best employee. It’s to exclude as many people as possible as cheaply as possible.”
Note that there’s no option to answer “all of the above.” Prospective workers must pick one option, without a clue as to how the program will interpret it. And some of the analysis will draw unflattering conclusions.
Certain patterns of answers, however, can and do disqualify them. And we do not know what those patterns are. We’re not told what the tests are looking for. The process is entirely opaque.
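A minimal sketch of the opacity described above, with entirely invented trait names, weights, and cutoff: the applicant supplies forced-choice answers, a hidden weighting turns them into a score, and a threshold they never see red-lights them.

```python
# Hypothetical sketch of an opaque screening model. The questions,
# weights, and cutoff here are all invented for illustration; real
# vendors disclose none of them.

HIDDEN_WEIGHTS = {
    ("conflict", "avoid"): -2.0,    # some answer patterns quietly penalize
    ("conflict", "confront"): 0.5,
    ("mood", "variable"): -3.0,
    ("mood", "steady"): 1.0,
}
CUTOFF = 0.0  # applicants are never told this threshold exists

def score(answers):
    """answers: dict mapping question -> the single forced-choice option picked."""
    return sum(HIDDEN_WEIGHTS.get((q, a), 0.0) for q, a in answers.items())

def screen(answers):
    return "green-light" if score(answers) >= CUTOFF else "red-light"

applicant = {"conflict": "avoid", "mood": "variable"}
print(screen(applicant))  # red-light, with no explanation of why
```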
Now imagine that Kyle Behm, after getting red-lighted at Kroger, goes on to land a job at McDonald’s. He turns into a stellar employee. He’s managing the kitchen within four months and the entire franchise a year later. Will anyone at Kroger go back to the personality test and investigate how they could have gotten it so wrong?
The company may be satisfied with the status quo, but the victims of its automatic systems suffer.
Job candidates, especially those applying for minimum-wage work, get rejected all the time and rarely find out why.
This is exactly what the Americans with Disabilities Act is supposed to prevent.
we’ve seen time and again that mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education. It’s up to society whether to use that intelligence to reject and punish them—or to reach out to them with the resources they need. We can use the scale and efficiency that make WMDs so pernicious in order to help people. It all depends on the objective we choose.
But the most problematic correlation had to do with geography. Job applicants who lived farther from the job were more likely to churn. This makes sense: long commutes are a pain. But Xerox managers noticed another correlation. Many of the people suffering those long commutes were coming from poor neighborhoods. So Xerox, to its credit, removed that highly correlated churn data from its model. The company sacrificed a bit of efficiency for fairness.
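A hedged sketch of the kind of fix Xerox is described as making: drop a feature that proxies for poverty before fitting a churn model. The column names and data are hypothetical, and scikit-learn stands in for whatever tooling the company actually used.

```python
# Toy churn model illustrating the fairness fix described above:
# commute distance predicts churn but also proxies for poor
# neighborhoods, so it is excluded before training.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "tenure_months": [3, 24, 1, 36, 2, 48],
    "commute_miles": [22, 4, 30, 3, 25, 5],  # correlated with churn, and with poor zip codes
    "churned":       [1, 0, 1, 0, 1, 0],
})

FAIRNESS_EXCLUDED = ["commute_miles"]  # a bit of efficiency sacrificed for fairness

X = df.drop(columns=["churned"] + FAIRNESS_EXCLUDED)
y = df["churned"]
model = LogisticRegression().fit(X, y)
```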
clopening.
The goal, of course, is to spend as little money as possible, which means keeping staffing at the bare minimum while making sure that reinforcements are on hand for the busy times.
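A toy version of that staffing logic, under invented numbers: convert an hourly demand forecast into the bare minimum of workers, with a single "reinforcement" added only when the forecast crosses a busy-period threshold. This is a sketch of the objective, not any vendor's actual algorithm.

```python
# Bare-minimum staffing from a demand forecast; all figures invented.
import math

forecast_customers = {9: 12, 10: 30, 11: 75, 12: 120, 13: 90, 14: 40}
CUSTOMERS_PER_WORKER = 15
BUSY_THRESHOLD = 80   # hours at or above this get one extra worker

def staff_needed(customers):
    base = math.ceil(customers / CUSTOMERS_PER_WORKER)
    return base + (1 if customers >= BUSY_THRESHOLD else 0)

schedule = {hour: staff_needed(c) for hour, c in forecast_customers.items()}
print(schedule)  # {9: 1, 10: 2, 11: 5, 12: 9, 13: 7, 14: 3}
```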
How do workers fight back without unionized data? This is why we should have fractional healthcare support, if not universal healthcare.
The money saved, naturally, comes straight from employees’ pockets.
But instead of lawn mower blades or cell phone screens showing up right on cue, it’s people, usually people who badly need money. And because they need money so desperately, the companies can bend their lives to the dictates of a mathematical model.
Scheduling software also creates a poisonous feedback loop. Consider Jannette Navarro. Her haphazard scheduling made it impossible for her to return to school, which dampened her employment prospects and kept her in the oversupplied pool of low-wage workers. The long and irregular hours also make it hard for workers to organize or to protest for better conditions. Instead, they face heightened anxiety and sleep deprivation, which causes dramatic mood swings and is responsible for an estimated 13 percent of highway deaths. Worse yet, since the software is designed to save companies money, it…
Scientists need this error feedback—in this case the presence of false negatives—to delve into forensic analysis and figure out what went wrong, what was misread, what data was ignored.
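A minimal sketch of that error feedback, on invented records: compare the model's predictions against actual outcomes and pull out the false negatives, the cases the model dismissed that turned out to matter, for human review.

```python
# Surface false negatives for forensic review; records are invented.

records = [
    {"id": 1, "predicted_risk": 0, "actual_outcome": 1},  # false negative
    {"id": 2, "predicted_risk": 1, "actual_outcome": 1},  # true positive
    {"id": 3, "predicted_risk": 0, "actual_outcome": 0},  # true negative
    {"id": 4, "predicted_risk": 0, "actual_outcome": 1},  # false negative
]

false_negatives = [r for r in records
                   if r["predicted_risk"] == 0 and r["actual_outcome"] == 1]

for case in false_negatives:
    print(f"Review case {case['id']}: what was misread, what data was ignored?")
```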
In 1983, the Reagan administration issued a lurid alarm about the state of America’s schools. In a report called A Nation at Risk, a presidential panel warned that a “rising tide of mediocrity” in the schools threatened “our very future as a Nation and a people.” The report added that if “an unfriendly foreign power” had attempted to impose these bad schools on us, “we might well have viewed it as an act of war.”
“There were actually two other teachers who scored below me in my school. That emboldened me to share my results, because I wanted those teachers to know it wasn’t only them.”
The value-added model had given him a failing grade but no advice on how to improve
In statistics, this phenomenon is known as Simpson’s Paradox: when a whole body of data displays one trend, yet when broken into subgroups, the opposite trend comes into view for each of those subgroups.
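A small numerical illustration of the paradox, with invented success rates: method A beats method B inside every subgroup, yet pooling the subgroups reverses the ranking because the groups differ in size.

```python
# Simpson's Paradox on invented data: A wins in each subgroup,
# B wins overall once the subgroups are pooled.

groups = {
    # group: (successes_A, trials_A, successes_B, trials_B)
    "subgroup 1": (8, 10, 70, 90),   # A: 80%  > B: ~78%
    "subgroup 2": (20, 90, 2, 10),   # A: ~22% > B: 20%
}

tot_sa = tot_ta = tot_sb = tot_tb = 0
for name, (sa, ta, sb, tb) in groups.items():
    print(f"{name}: A = {sa/ta:.0%}, B = {sb/tb:.0%}")  # A wins in each subgroup
    tot_sa += sa; tot_ta += ta; tot_sb += sb; tot_tb += tb

# Pooled, the trend reverses: A = 28/100 = 28%, B = 72/100 = 72%
print(f"overall:    A = {tot_sa/tot_ta:.0%}, B = {tot_sb/tot_tb:.0%}")
```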
And then along came an algorithm, and things improved. A mathematician named Earl Isaac and his engineer friend, Bill Fair, devised a model they called FICO to evaluate the risk that an individual would default on a loan.
Much of the predatory advertising we’ve been discussing, including the ads for payday loans and for-profit colleges, is generated through such e-scores.
since companies are legally prohibited from using credit scores for marketing purposes, they make do with this sloppy substitute.
the consequence is that companies end up diving into largely unregulated pools of data, such as clickstreams and geo-tags, in order to create a parallel data marketplace. In the process, they can largely avoid government oversight.
E-scores, by contrast, march us back in time. They analyze the individual through a veritable blizzard of proxies. In a few milliseconds, they carry out thousands of “people like you” calculations. And if enough of these “similar” people turn out to be deadbeats or, worse, criminals, that individual will be treated accordingly.
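A hedged sketch of those "people like you" calculations: a nearest-neighbors lookup over crude proxy features, where a newcomer simply inherits the default rate of their most "similar" neighbors. The features, data, and distance metric are all hypothetical.

```python
# Toy e-score via k-nearest neighbors; everything here is invented.
import numpy as np

# columns: [zip-code income decile, browsing-category score] -- crude proxies
known_people = np.array([[2, 0.9], [3, 0.8], [8, 0.1], [9, 0.2], [2, 0.7]])
known_defaulted = np.array([1, 1, 0, 0, 1])

def e_score(person, k=3):
    """Score a newcomer by the default rate of their k nearest 'similar' people."""
    distances = np.linalg.norm(known_people - person, axis=1)
    nearest = np.argsort(distances)[:k]
    return known_defaulted[nearest].mean()  # treated as this person's own risk

print(e_score(np.array([2.5, 0.85])))  # judged by neighbors, not their own record
```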
I should note that in the statistical universe proxies inhabit, they often work. More often than not, birds of a feather do fly together. Rich people buy cruises and BMWs. All too often, poor people need a payday loan. And since these statistical models appear to work much of the time, efficiency rises and profits surge. Investors double down on scientific systems that can place thousands of people into what appear to be the correct buckets. It’s the triumph of Big Data.
The practice of using credit scores in hirings and promotions creates a dangerous poverty cycle.
ten states have passed legislation to outlaw the use of credit scores in hiring.
Computer-generated terrorism no-fly lists, for example, are famously rife with errors. An innocent person whose name resembles that of a suspected terrorist faces a hellish ordeal every time he has to get on a plane. (Wealthy travelers, by contrast, are often able to pay to acquire “trusted traveler” status, which permits them to waltz through security. In effect, they’re spending money to shield themselves from a WMD.)
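A minimal sketch of why such list matching misfires, assuming a loose string-similarity threshold (the names and cutoff are invented; real watchlist systems are far more elaborate): an innocent near-namesake clears the bar and gets flagged.

```python
# Loose fuzzy name matching flags innocent lookalike names.
from difflib import SequenceMatcher

watchlist = ["John Mitchel", "T. Kennedy"]
THRESHOLD = 0.85  # too loose: catches near-namesakes

def flagged(passenger):
    return any(
        SequenceMatcher(None, passenger.lower(), name.lower()).ratio() >= THRESHOLD
        for name in watchlist
    )

print(flagged("John Mitchell"))  # True: an innocent near-namesake is flagged
print(flagged("Jane Doe"))       # False
```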
Some data brokers, no doubt, are more dependable than others. But any operation that attempts to profile hundreds of millions of people from thousands of different sources is going to get a lot of the facts wrong.
American Express learned this the hard way in 2009, just as the Great Recession was gearing up. No doubt looking to reduce risk on its own balance sheet, Amex cut the spending limits of some customers.
Many of these cardholders, it’s safe to say, frequented “stores associated with poor repayments” because they weren’t swimming in money. And wouldn’t you know it? An algorithm took notice and made them poorer.
Merrill proclaims that “all data is credit data.”
Lending Club and its chief rival, Prosper, are still tiny. They’ve generated less than $10 billion in loans, which is but a speck in the $3 trillion consumer lending market. Yet they’re attracting loads of attention. Executives from Citigroup and Morgan Stanley serve as directors of peer-to-peer players, and Wells Fargo’s investment fund is the largest investor in Lending Club. Lending Club’s stock offering in December of 2014 was the biggest tech IPO of the year. It raised $870 million and reached a valuation of $9 billion, making it the fifteenth most valuable bank in America.
In fact, compared to the slew of WMDs running amok, the prejudiced loan officer of yesteryear doesn’t look all that bad.
Frederick Hoffman created a potent WMD. It’s very likely that Hoffman, a German who worked for the Prudential Life Insurance Company, meant no harm. Later in his life, his research contributed mightily to public health. He did valuable work on malaria and was among the first to associate cancer with tobacco. Yet on a spring day in 1896, Hoffman published a 330-page report that set back the cause of racial equality in the United States and reinforced the status of millions as second-class citizens. His report used exhaustive statistics to make the case that the lives of black Americans were so…
One was a draper in London named John Graunt. He went through birth and death records and in 1662 came up with the first study of the mortality rates of an entire community of people.