Kindle Notes & Highlights
by Cathy O'Neil
Read between March 29 - April 5, 2025
If we think about human resources policies at IBM and other companies as algorithms, they codified discrimination for decades. The move to equalize benefits nudged them toward fairness.
Unfortunately, there’s a glaring difference. Gay rights benefited in many ways from market forces. There was a highly educated and increasingly vocal gay and lesbian talent pool that companies were eager to engage. So they optimized their models to attract them. But they did this with the focus on the bottom line. Fairness, in most cases, was a by-product.
But WMDs generating fabulous profit margins are not likely to remain cloistered for long in the lower ranks. That’s not the way markets work.
But human decision making, while often flawed, has one chief virtue. It can evolve. As human beings learn and adapt, we change, and so do our processes. Automated systems, by contrast, stay stuck in time until engineers dive in to change them.
Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.
Clearly, the free market could not control its excesses. So after journalists like Ida Tarbell and Upton Sinclair exposed these and other problems, the government stepped in.
How do we start to regulate the mathematical models that run more and more of our lives? I would suggest that the process begin with the modelers themselves.
Like doctors, data scientists should pledge a Hippocratic Oath, one that focuses on the possible misuses and misinterpretations of their models. Following the market crash of 2008, two financial engineers, Emanuel Derman and Paul Wilmott, drew up such an oath. It reads:
~ I will remember that I didn’t make the world, and it doesn’t satisfy my equations.
~ Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
~ I will never sacrifice reality for elegance without explaining why I have done so.
~ Nor will I give the people who use my model false comfort …
Today, the success of a model is often measured in terms of profit, efficiency, or default rates.
Yet from society’s perspective, a simple hunt for government services puts a big target on the back of poor people, leading a certain number of them toward false promises and high-interest loans.
The fact that people need food stamps in the first place represents a failing of the market economy.
A regulatory system for WMDs would have to measure such hidden costs, while also incorporating a host of non-numerical values.
Mathematical models should be our tools, not our masters.
So the first step is to get a grip on our techno-utopia, that unbounded and unwarranted hope in what algorithms and technology can accomplish. Before asking them to do better, we have to admit they can’t do everything.
Unlike, say, relief pitchers in baseball, they rarely have great seasons followed by disasters. (And also unlike relief pitchers, their performance resists quantitative analysis.)
There’s no fixing a backward model like the value-added model. The only solution in such a case is to ditch the unfair system.
It is true, as data boosters are quick to point out, that the human brain runs internal models of its own, and they’re often tinged with prejudice or self-interest. So its outputs—in this case, teacher evaluations—must also be audited for fairness.
But wait, many would say. Are we going to sacrifice the accuracy of the model for fairness? Do we have to dumb down our algorithms? In some cases, yes. If we’re going to be equal before the law, or be treated equally as voters, we cannot stand for systems that drop us into different castes and treat us differently.*1
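To make that accuracy-versus-fairness tension concrete, here is a minimal audit sketch in Python. It is not drawn from the book: the decisions, the group labels, and the 0.8 "four-fifths" cutoff are all illustrative assumptions. The idea is simply that before a model is tuned purely for accuracy or profit, one can check whether its favorable decisions are spread evenly across groups.

```python
# A minimal fairness-audit sketch (illustrative, not O'Neil's method).
# It compares a model's favorable-decision rates across two hypothetical groups.

def positive_rate(decisions):
    """Share of people who received the favorable outcome (1 = approved)."""
    return sum(decisions) / len(decisions)

def disparate_impact(decisions_a, decisions_b):
    """Ratio of positive rates; values well below 1.0 suggest group A is disfavored."""
    return positive_rate(decisions_a) / positive_rate(decisions_b)

# Hypothetical model outputs: 1 = approved, 0 = denied.
group_a = [1, 0, 0, 1, 0, 0, 0, 1]   # e.g., applicants from one neighborhood
group_b = [1, 1, 0, 1, 1, 0, 1, 1]   # e.g., applicants from another

ratio = disparate_impact(group_a, group_b)
print(f"approval rate A: {positive_rate(group_a):.2f}")
print(f"approval rate B: {positive_rate(group_b):.2f}")
print(f"disparate impact ratio: {ratio:.2f}")

# An assumed rule of thumb: flag the model if the ratio falls below 0.8,
# which is the point at which trading some accuracy for fairness is on the table.
if ratio < 0.8:
    print("Audit flag: consider a fairer threshold, even at some cost to accuracy.")
```

In practice an audit like this would run on real decision logs and across more than one protected attribute; the point is only that the check is cheap compared with the harm of never making it.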
Academic support for these initiatives is crucial. After all, to police the WMDs we need people with the skills to build them.
Auditors face resistance, however, often from the web giants, which are the closest thing we have to information utilities. Google, for example, has prohibited researchers from creating scores of fake profiles in order to map the biases of the search engine.*2 Facebook, too. The social network’s rigorous policy to tie users to their real names severely limits the research outsiders can carry out there.
As we discussed in the chapter on credit scores, the civil rights laws referred to as the Fair Credit Reporting Act (FCRA) and the Equal Credit Opportunity Act (ECOA) were meant to ensure fairness in credit scoring. The FCRA guarantees that a consumer can see the data going into their score and correct any errors,
Each of us should have the right to receive an alert when a credit score is being used to judge or vet us. And each of us should have access to the information being used to compute that score. If it is incorrect, we should have the right to challenge and correct it.
Next, the regulations should expand to cover new types of credit companies, like Lending Club, which use newfangled e-scores to predict the risk that we’ll default on loans. They should not be allowed to operate in the shadows.
The Americans with Disabilities Act (ADA), which protects people with medical issues from being discriminated against...
One possibility already under discussion would extend protection of the ADA to include “predicted” health outcomes down the road.
We must also expand the Health Insurance Portability and Accountability Act (HIPAA), which protects our medical information, in order to cover the medical data currently being collected by employers, health apps, and other Big Data companies.
If we want to bring out the big guns, we might consider moving toward the European model, which stipulates that any data collected must be approved by the user, as an opt-in. It also prohibits the reuse of data for other purposes. The opt-in condition is all too often bypassed by having a user click on an inscrutable legal box.
A certain group of homeless families tended to disappear from shelters and never return. These were the ones who had been granted vouchers under a federal affordable housing program called Section 8. This shouldn’t have been too surprising. If you provide homeless families with affordable housing, not too many of them will opt for the streets or squalid shelters. Yet that conclusion might have been embarrassing to then-mayor Michael Bloomberg and his administration. With much fanfare, the city government had moved to wean families from Section 8.
Meanwhile, New York’s booming real estate market was driving up rents, making the transition even more daunting. Families without Section 8 vouchers streamed back into the shelters.
A simple workflow data analysis might highlight five workers who appear to be superfluous. But if the data team brings in an expert, they might help discover a more constructive version of the model.
If we back away from them and treat mathematical models as a neutral and inevitable force, like the weather or the tides, we abdicate our responsibility.