Kindle Notes & Highlights: Weapons of Math Destruction by Cathy O'Neil
Read between October 7, 2017 - January 15, 2018
But baseball represents a healthy case study—and it serves as a useful contrast to the toxic models, or WMDs, that are popping up in so many areas of our lives. Baseball models are fair, in part, because they’re transparent. Everyone has access to the stats and can understand more or less how they’re interpreted.
I'm curious whether notes put into my Amazon online "notebook" sync back to my Kindle, or across to my Goodreads account, when added online.
Update: syncing definitely works in the online notebook -> Goodreads direction.
Moreover, their data is highly relevant to the outcomes they are trying to predict. This may sound obvious, but as we’ll see throughout this book, the folks building WMDs routinely lack data for the behaviors they’re most interested in.
Whatever they learn, they can feed back into the model, refining it. That’s how trustworthy models operate.
The updates and adjustments make it what statisticians call a “dynamic model.”
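(My own aside, not from the book: here is a minimal sketch of what a "dynamic model" can look like in code. The DynamicModel class and its update rule are hypothetical, chosen only to make the feedback-and-refinement idea concrete.)

```python
class DynamicModel:
    """Toy dynamic model: a running prediction that is refined
    every time a new outcome is observed."""

    def __init__(self, initial_guess: float, learning_rate: float = 0.2):
        self.estimate = initial_guess        # current prediction
        self.learning_rate = learning_rate   # how strongly feedback adjusts it

    def predict(self) -> float:
        return self.estimate

    def update(self, observed: float) -> None:
        # Feed the prediction error back into the model, refining it.
        error = observed - self.estimate
        self.estimate += self.learning_rate * error


# Example: predicting how many Pop-Tarts get eaten at dinner each night.
model = DynamicModel(initial_guess=4.0)
for eaten in [6, 5, 6, 7]:           # observed outcomes, night by night
    print(f"predicted {model.predict():.1f}, observed {eaten}")
    model.update(eaten)              # the update is what makes it "dynamic"
```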
dole out a certain amount of Pop-Tarts, but only enough to forestall an open rebellion.
I would have turned the food management I keep in my head, my informal internal model, into a formal external one.
We expect it to handle only one job and accept that it will occasionally act like a clueless machine, one with enormous blind spots.
Models are opinions embedded in mathematics.
You can often see troubles when grandparents visit a grandchild they haven’t seen for a while.
Upon meeting her a year later, they can suffer a few awkward hours because their models are out of date.
Racism, at the individual level, can be seen as a predictive model whirring away in billions of human minds around the world. It is built from faulty, incomplete, or generalized data. Whether it comes from experience or hearsay, the data indicates that certain types of people have behaved badly. That generates a binary prediction that all people of that race will behave that same way.
Needless to say, racists don’t spend a lot of time hunting down reliable data to train their twisted models.
the workings of a recidivism model are tucked away in algorithms, intelligible only to a tiny elite.
A 2013 study by the New York Civil Liberties Union found that while black and Latino males between the ages of fourteen and twenty-four made up only 4.7 percent of the city’s population, they accounted for 40.6 percent of the stop-and-frisk checks by police.
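(Quick arithmetic of my own to make the disparity concrete:)

```python
# Figures from the 2013 NYCLU study quoted above.
population_share = 4.7 / 100   # share of NYC population
stop_share = 40.6 / 100        # share of stop-and-frisk checks

# Over-representation ratio: how many times more often this group was
# stopped than its population share alone would predict.
print(f"{stop_share / population_share:.1f}x")  # about 8.6x
```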
So if early “involvement” with the police signals recidivism, poor people and racial minorities look far riskier.
The questionnaire does avoid asking about race, which is illegal. But with the wealth of detail each prisoner provides, that single illegal question is almost superfluous.
We are judged by what we do, not by who we are.
I should add that my model is highly unlikely to scale. I don’t see Walmart or the US Agriculture Department or any other titan embracing my app and imposing it on hundreds of millions of people, like some of the WMDs we’ll be discussing.
The first question: Even if the participant is aware of being modeled, or what the model is used for, is the model opaque, or even invisible?
many companies go out of their way to hide the results of their models or even their existence. One common justification is that the algorithm constitutes a “secret sauce” crucial to their business. It’s intellectual property, and it must be defended,
While many may benefit from it, it leads to suffering for others.
The third question is whether a model has the capacity to grow exponentially. As a statistician would put it, can it scale?
scale is what turns WMDs from local nuisances into tsunami forces, ones that define and delimit our lives.
You could argue, for example, that the recidivism scores are not totally opaque, since they spit out scores that prisoners, in some cases, can see. Yet they’re brimming with mystery, since the prisoners cannot see how their answers produce their score. The scoring algorithm is hidden.
This is similar to anti-class-action laws and arbitration clauses, which prevent affected groups from realizing they're being discriminated against in the workplace or in healthcare.
the point is not whether some people benefit. It’s that so many suffer.
And here’s one more thing about algorithms: they can leap from one field to the next, and they often do. Research in epidemiology can hold insights for box office predictions; spam filters are being retooled to identify the AIDS virus. This is true of WMDs as well. So if mathematical models in prisons appear to succeed at their job—which really boils down to efficient management of people—they could spread into the rest of the economy along with the other WMDs, leaving us as collateral damage.
Even the smallest patterns can bring in millions to the first investor who unearths them.
The quest for what quants call market inefficiencies is like a treasure hunt.
Since our ideas and algorithms were the foundation of the hedge fund’s business, it was clear that we quants also represented a risk: if we walked away, we could quickly use our knowledge to fuel a fierce competitor.
currency forwards, which were promises to buy large amounts of a foreign currency in a couple of days.
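(For context, a standard textbook relationship rather than anything the book derives: a currency forward's fair price is tied to the two currencies' interest rates through covered interest parity. A minimal sketch with made-up numbers:)

```python
def forward_rate(spot: float, r_domestic: float, r_foreign: float, days: int) -> float:
    """Covered-interest-parity forward price for a currency pair.

    spot       : current exchange rate (domestic units per unit of foreign)
    r_domestic : annualized domestic interest rate
    r_foreign  : annualized foreign interest rate
    days       : days until the forward settles
    """
    t = days / 360.0  # simple money-market day count
    return spot * (1 + r_domestic * t) / (1 + r_foreign * t)


# Hypothetical: a 2-day forward on a currency spot-quoted at 1.2500.
print(forward_rate(spot=1.25, r_domestic=0.05, r_foreign=0.02, days=2))
```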
Looking back, you could say the interest rate spikes were actually a sign of sanity, although they obviously came too late.
Hedge funds, after all, didn’t make these markets. They just played in them. That meant that when the market crashed, as it would, rich opportunities would emerge from the wreckage. The game for hedge funds was not so much to ride markets up as to predict the movements within them. Down could be every bit as lucrative.
Their only glimpse of what lurked inside came from analyst ratings. And these analysts collected fees from the very companies whose products they were rating. Mortgage-backed securities, needless to say, were an ideal platform for fraud.
The risk model attached to mortgage-backed securities was a WMD.
The first false assumption was that crack mathematicians in all of these companies were crunching the numbers and ever so carefully balancing the risk.
Even rigorous mathematicians—and there were a few—were working with numbers provided by people carrying out wide-scale fraud.
The second false assumption was that not many people would default at the same time.
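(A toy simulation of my own, not the book's, shows why that assumption mattered: if defaults are independent, large losses across a big pool are vanishingly rare; add even a modest shared shock, say a housing downturn that raises every loan's default risk at once, and they become routine. All parameters below are made up.)

```python
import random

def simulate_pool(n_loans=1000, p_default=0.05, shock_prob=0.0,
                  shock_boost=0.25, trials=2_000):
    """Fraction of trials in which more than 15% of the pool defaults.

    Each trial, a shared "bad economy" shock hits with probability
    shock_prob and raises every loan's default probability at once,
    which is what correlates the defaults.
    """
    bad = 0
    for _ in range(trials):
        p = p_default + (shock_boost if random.random() < shock_prob else 0.0)
        defaults = sum(random.random() < p for _ in range(n_loans))
        if defaults > 0.15 * n_loans:
            bad += 1
    return bad / trials

print("independent defaults:", simulate_pool(shock_prob=0.0))  # ~0.00
print("correlated defaults: ", simulate_pool(shock_prob=0.1))  # ~0.10
```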
These risk models also created their own pernicious feedback loop. The AAA ratings on defective products turned into dollars. The dollars in turn created confidence in the products and in the cheating-and-lying process that manufactured them. The resulting cycle of mutual back-scratching and pocket-filling was how the whole sordid business operated until it blew up.
was forced to confront the ugly truth: people had deliberately wielded formulas to impress rather than clarify.
These new hires are ravenous for success and have been focused on external metrics—like SAT scores and college admissions—their entire lives. Whether in finance or tech, the message they’ve received is that they will be rich, that they will run the world. Their productivity indicates that they’re on the right track, and it translates into dollars. This leads to the fallacious conclusion that whatever they’re doing to bring in more money is good. It “adds value.” Otherwise, why would the market reward it?
In both cultures, wealth is no longer a means to get by. It becomes directly tied to personal worth.
In both of these industries, the real world, with all of its messiness, sits apart. The inclination is to replace people with data trails, turning them into more effective shoppers, voters, or workers to optimize some objective. This is easy to do, and to justify, when success comes back as an anonymous score and when the people affected remain every bit as abstract as the numbers dancing across the screen.
More and more, I worried about the separation between technical models and real people, and about the moral repercussions of that separation.