Kindle Notes & Highlights
by
Eric Siegel
Read between
February 4 - February 15, 2022
Siegel also resists the blandishments of the “big data” movement. Certainly some of the examples he mentions fall into this category—data that is too large or unstructured to be easily managed by conventional relational databases. But the point of predictive analytics is not the relative size or unruliness of your data, but what you do with it. I have found that “big data often equals small math,”
PA is like Moneyball for…money.
As data piles up, we have ourselves a genuine gold rush. But data isn't the gold. I repeat, data in its raw form is boring crud. The gold is what's discovered therein.
Hewlett-Packard (HP) earmarks each and every one of its more than 300,000 worldwide employees according to “Flight Risk,” the expected chance he or she will quit their job, so that managers may intervene in advance where possible and plan accordingly otherwise.
Wikipedia predicts which of its editors, who work for free as a labor of love to keep this priceless online asset alive, are going to discontinue their valuable service.
Inspired by the TV crime drama Lie to Me about a microexpression reader, researchers at the University at Buffalo trained a system to detect lies with 82 percent accuracy by observing eye movements alone.
A breakthrough in machine learning would be worth 10 Microsofts. —Bill Gates
The first step toward predicting the future is admitting you can't. —Stephen Dubner, Freakonomics Radio, March 30, 2011
The “prediction paradox”: The more humility we have about our ability to make predictions, the more successful we can be in planning for the future. —Nate Silver, The Signal and the Noise: Why So Many Predictions Fail—but Some Don't
In the insurance business, one company reports that PA saves almost $50 million annually by decreasing its loss ratio by half a percentage point.
Before each launch, organizations establish confidence in PA by “predicting the past” (aka backtesting). The predictive model must prove itself on historical data before its deployment. Conducting a kind of simulated prediction, the model evaluates across data from last week, last month, or last year.
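A minimal sketch of the backtesting idea in the highlight above, assuming a toy one-feature "model" and invented records (the data, the `fit_threshold` helper, and the cutoff date are all hypothetical, not from the book). The model is fitted only on records before a cutoff date and then scored on the later records, as if it were predicting them in advance.

```python
from datetime import date

# Hypothetical history: (date observed, feature value, whether the outcome occurred)
history = [
    (date(2021, 1, 5), 0.2, False),
    (date(2021, 2, 9), 0.8, True),
    (date(2021, 3, 3), 0.4, False),
    (date(2021, 4, 20), 0.7, True),
    (date(2021, 5, 11), 0.3, False),
    (date(2021, 6, 2), 0.9, True),
    (date(2021, 7, 15), 0.6, True),
    (date(2021, 8, 8), 0.1, False),
    (date(2021, 9, 30), 0.85, True),
    (date(2021, 10, 12), 0.5, False),
]


def fit_threshold(rows):
    """Toy model: choose the feature cutoff that classifies the given rows best."""
    candidates = sorted({x for _, x, _ in rows})
    return max(candidates, key=lambda c: sum((x >= c) == y for _, x, y in rows))


def backtest(rows, cutoff_date):
    """Fit on records before cutoff_date, then measure accuracy on the later ones."""
    train = [r for r in rows if r[0] < cutoff_date]
    test = [r for r in rows if r[0] >= cutoff_date]
    threshold = fit_threshold(train)
    hits = sum((x >= threshold) == y for _, x, y in test)
    return hits / len(test)


accuracy = backtest(history, date(2021, 7, 1))
print(accuracy)  # 0.75 on this invented data
```

The key design choice is that the split is by time rather than at random, so the model never sees anything from the period it is asked to "predict".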
Alexis Madrigal, senior editor at The Atlantic, points out that a user's data can be purchased for about half a cent, but the average user's value to the Internet advertising ecosystem is estimated at $1,200 per year.
A score produced by any predictive model must be taken with a very particular grain of salt. Scores speak to trends and probabilities across a large group; one individual probability by its nature oversimplifies the real-world thing it describes. If I were to miss a single credit card payment, the probability that I'd miss another in the same year may quadruple, based on that factor alone.
Oregon launched a crime prediction tool to be consulted by judges when sentencing convicted felons. The tool is on display for anyone to try out. If you know the convict's state ID and the crime for which he or she is being sentenced, you can enter the information on the Oregon Criminal Justice Commission's public website and see the predictive model's output: the probability the offender will be convicted again for a felony within three years of being released.
A joint study by Columbia University and Ben Gurion University (Israel) showed that hungry judges rule negatively. Judicial parole decisions immediately after a food break are about 65 percent favorable, but then drop gradually to almost zero percent before the next break. If your parole board judges are hungry, you're much more likely to stay in prison.
Even Ellen Kurtz, who champions the adoption of the crime model in Philadelphia, admits, “If you wanted to remove everything correlated with race, you couldn't use anything. That's the reality of life in America.”
As with pregnancy, predictive models can also ascertain minority status—from behavior online, where divulging demographics would otherwise come only at the user's discretion. A study from the University of Cambridge shows that race, age, sexual orientation, and political orientation can be determined with high levels of accuracy based on one's Facebook likes. This capability could grant marketers and other researchers access to unvolunteered demographic information.
The growth is exponential. Data more than doubles every three years. This brought us to an estimated 8 zettabytes in 2015—that's 8,000,000,000,000,000,000,000 (21 zeros) bytes. Welcome to Big Bang 2.0. The next logical question is: What's the most valuable thing to do with all this stuff? This book's answer: Learn from it how to predict.
Female-named hurricanes are more deadly. Based on a study of the most damaging hurricanes in the United States during six recent decades, the ones with “relatively feminine” names killed an average of 42 people, almost three times the 15 killed by hurricanes with “relatively male” names (university researchers). This may result from “a hazardous form of implicit sexism.” Psychological experiments in a related study “suggested that this is because feminine- versus masculine-named hurricanes are perceived as less risky and thus motivate less preparedness.… Individuals systematically underestimate
The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt. —Bertrand Russell
Probability is relative, affected entirely by context. With additional background information, a seemingly unlikely event turns out to be not so special after all.
Imagine we test 70 characteristics of cars that in reality are not predictive of lemons. But each test suffers a, say, 1 percent risk the data will falsely show a predictive effect just by random chance. The accumulated risk piles up. As with the jackpot wheel, there's a 50/50 chance the unlikely event will eventually take place—that you will stumble upon a random perturbation that, considered in isolation, is compelling enough to mislead.
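The 50/50 figure in the highlight above can be checked with a short calculation. This sketch assumes the 70 tests are independent of one another, which the passage implies but does not state; the function name is illustrative.

```python
def chance_of_false_discovery(num_tests: int, false_positive_rate: float) -> float:
    """Probability that at least one of num_tests independent tests
    shows a spurious effect purely by chance."""
    # Chance that every test stays clean, subtracted from 1.
    return 1.0 - (1.0 - false_positive_rate) ** num_tests


risk = chance_of_false_discovery(70, 0.01)
print(f"{risk:.3f}")  # 0.505
```

With a 1 percent risk per test, 70 tests give roughly even odds of at least one misleading result, matching the passage.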
Automating Science: Vast Search
The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka!” but rather “Hmm…that's funny…” —Isaac Asimov
Most discussions of decision making assume that only senior executives make decisions or that only senior executives' decisions matter. This is a dangerous mistake. —Peter Drucker, an American educator and writer born in 1909
Dan's learning system made a discovery within Chase's data: If a mortgage's interest rate is under 7.94 percent, then the risk of prepayment is 3.8 percent; otherwise, the risk is 19.2 percent.
Divide and conquer and then divide some more, breaking down to smaller and smaller groups. And yet, as we'll discover, don't go too far. This learning method, called decision trees, isn't the only way to create a predictive model, but it's consistently voted as the most or second most popular by practitioners, due to its balance of relative simplicity with effectiveness.
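The mortgage rule quoted above is a decision tree with a single split, and it can be written directly as code. This is only an illustration of the rule as stated in the highlight; the function name is hypothetical, and a fuller tree would nest further tests inside each branch.

```python
def prepayment_risk(interest_rate_percent: float) -> float:
    """One-split decision tree from the quoted rule.

    Returns the estimated prepayment risk as a fraction (0.038 means 3.8 percent).
    """
    if interest_rate_percent < 7.94:
        return 0.038  # lower-rate branch: 3.8 percent risk
    return 0.192      # higher-rate branch: 19.2 percent risk


print(prepayment_risk(6.5))  # 0.038
print(prepayment_risk(8.5))  # 0.192
```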
If you torture the data long enough, it will confess. —Ronald Coase, Professor of Economics, University of Chicago
There are three kinds of lies: lies, damned lies, and
The culprit that kills learning is overlearning (aka overfitting). Overlearning is the pitfall of mistaking noise for information, assuming too much about what has been shown within data. You've overlearned if you've read too much into the numbers, led astray from discovering the underlying truth.
Among the competing approaches to machine learning, decision trees are often considered the most user friendly, since they consist of rules you can read like a long (if cumbersome) English sentence, while other methods are more mathy, taking the variables and plugging them into equations.
People close to the project at Chase reported that the predictive models generated millions of dollars of additional profit during the first year of deployment. The models correctly identified 74 percent of mortgage prepayments before they took place, and drove the management of mortgage portfolios successfully.
established recommendation capabilities by 10 percent. Netflix is a prime example of PA in action, as a reported 70 percent of Netflix movie choices arise from its online recommendations. Product recommendations
out. If we assume that people guess too high as much as they do too low, averaging cancels out these errors in judgment.
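A small simulation can illustrate the averaging argument above. It assumes guesses scatter symmetrically around the true value (modelled here as Gaussian noise); the true value, the spread, and the crowd size are invented for the sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
true_value = 100.0      # hypothetical quantity being guessed

# Each guess is the truth plus symmetric noise: too high as often as too low.
guesses = [true_value + rng.gauss(0, 20) for _ in range(1000)]

# How far off a single guesser is, on average.
typical_individual_error = statistics.mean(abs(g - true_value) for g in guesses)

# How far off the averaged guess is.
crowd_error = abs(statistics.mean(guesses) - true_value)

print(f"typical individual error: {typical_individual_error:.2f}")
print(f"error of the averaged guess: {crowd_error:.2f}")
```

Because the high and low errors offset one another, the averaged guess lands much closer to the truth than a typical individual does. If the guesses were biased in one direction, averaging would not remove that bias.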
Ensemble modeling has taken the PA industry by storm. It's often considered the most important predictive modeling advancement to come to fruition in the first decade of this century.
Research results consistently show that ensembles boost a single model's performance in the general range of 5 to 30 percent, and that integrating more models into an ensemble often continues to improve it further.
Programming a computer to work adeptly with human language is often considered the ultimate challenge of artificial intelligence (AI).
had a ball.” Great, you had fun. “I had a ball but I lost it.” Not so much fun! But in a certain context, the same phrase goes back to being about having a blast:
We are not scanning all those books to be read by people. We are scanning them to be read by an AI. —A Google employee regarding Google's book scanning, as quoted by George Dyson in Turing's Cathedral: The Origins of the Digital Universe

