“It’s really a no-brainer,” he said. “Wine is an agricultural product dramatically affected by the weather from year to year.” Using decades of weather data from France’s Bordeaux region, Orley found that low levels of harvest rain and high average summer temperatures produce the greatest wines. As Peter Passell reported in the New York Times, the statistical equation fit the data remarkably well.
Wine quality = 12.145 + (0.00117 × winter rainfall) + (0.0614 × average growing season temperature) − (0.00386 × harvest rainfall)
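Ashenfelter's equation is simple enough to evaluate directly. A minimal sketch, assuming rainfall is measured in millimeters and temperature in degrees Celsius (the excerpt does not state the units, and the sample inputs below are made up for illustration):

```python
# Ashenfelter's regression line, as quoted above. Units (mm of rain,
# degrees C) are an assumption; the excerpt does not specify them.
def wine_quality(winter_rain, growing_temp, harvest_rain):
    return (12.145
            + 0.00117 * winter_rain
            + 0.0614 * growing_temp
            - 0.00386 * harvest_rain)

# Illustrative (made-up) vintage: wet winter, warm summer, dry harvest.
print(wine_quality(600, 17.5, 100))
```

Note how the signs encode the finding in the text: harvest rain lowers predicted quality, while winter rain and a warm growing season raise it.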
As Upton Sinclair (and now Al Gore) has said, “It is difficult to get a man to understand something when his salary depends on his not understanding it.”
Bill James did the same for baseball. In his annual Baseball Abstracts, James challenged the notion that baseball experts could judge talent simply by watching a player. Michael Lewis’s Moneyball showed that James was baseball’s herald of data-driven decision making. James’s simple but powerful thesis was that data-based analysis in baseball was superior to observational expertise.
There are striking parallels between the ways that Ashenfelter and James originally tried to disseminate their number-crunching results. Just like Ashenfelter, James began by placing small ads for his first newsletter, Baseball Abstracts (which he euphemistically characterized as a book). In the first year, he sold a total of seventy-five copies. Just as Ashenfelter was locked out of Wine Spectator, James was given the cold shoulder by the Elias Sports Bureau when he asked to share data.
We are in a historic moment of horse-versus-locomotive competition, where intuitive and experiential expertise is losing out time and time again to number crunching. In the old days, many decisions were simply based on some mixture of experience and intuition.
What is Super Crunching? It is statistical analysis that impacts real-world decisions.
In field after field, “intuitivists” and traditional experts are battling Super Crunchers. In medicine, a raging controversy over what is called “evidence-based medicine” boils down to a question of whether treatment choice will be based on statistical analysis or not.
Steven D. Levitt and Stephen J. Dubner showed in Freakonomics dozens of examples of how statistical analysis of databases can reveal the secret levers of causation.
All that changed when they saw the first draft of our paper. After looking at auto theft in fifty-six cities over fourteen years, we found that LoJack had a huge positive benefit for other people. In high-crime areas, a $500 investment in LoJack reduced the car theft losses of non-LoJack users by
Neil Clark Warren, eHarmony’s founder and driving force, studied more than 5,000 married people in the late 1990s. Warren patented a predictive statistical model of compatibility based on twenty-nine different variables related to a person’s emotional temperament, social style, cognitive mode, and relationship skills.
A regression is a statistical procedure that takes raw historical data and estimates how various causal factors influence a single variable of interest.
The regression technique was developed more than 100 years ago by Francis Galton, a cousin of Charles Darwin. Galton estimated the first regression line way back in 1877. Remember Orley Ashenfelter’s simple equation to predict the quality of wine? That equation came from a regression.
Galton called this phenomenon “regression toward mediocrity”; we now call it “regression toward the mean.”
This is the wisdom of crowds that goes beyond the conscious choices of individual members to see what works at unconscious, hidden levels.
Barbara Ehrenreich was appalled when she took an employment test at a Minneapolis Wal-Mart and was told that she had given the wrong answer when she agreed with the proposition “there is room in every corporation for a non-conformist.”
I’m told that Visa already does predict the probability of divorce based on credit card purchases (so that it can make better predictions of default risk).
The statistical regression not only produces a prediction, it also simultaneously reports how precisely it was able to predict. That’s right—a regression tells you how accurate the prediction is.
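The claim that a regression reports its own precision can be made concrete. A minimal sketch of a one-variable least-squares fit that also returns the standard error of its slope estimate (the data points are illustrative, not from the book):

```python
import math

# Fit y = a + b*x by ordinary least squares and report the standard
# error of the slope -- the regression's built-in measure of how
# precisely it was able to estimate the relationship.
def fit_with_precision(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                    # estimated slope
    a = my - b * mx                  # estimated intercept
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in residuals) / (n - 2)   # residual variance
    se_b = math.sqrt(s2 / sxx)                     # std. error of slope
    return a, b, se_b

# Illustrative data: y is roughly 2x, with small noise.
a, b, se = fit_with_precision([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
```

A tiny standard error means the regression is confident in its estimate; a large one tells you the prediction is little better than a guess.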
USA Today reported that the National Security Agency has been amassing a database with the records of two trillion telephone calls since 2001.
In 1925, Ronald Fisher, the father of modern statistics, formally proposed using random assignments to test whether particular medical interventions had some predicted effect. The first randomized trial on humans (of an early antibiotic against tuberculosis) didn’t take place until the late 1940s. But now, with the encouragement of the Food and Drug Administration, randomized tests have become the gold standard for proving whether or not medical treatments are efficacious.
Indeed, Google even will start by rotating your ads and then automatically shift toward the ad that has the higher click-through rate.
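Google's actual optimizer is proprietary, but the rotate-then-shift idea can be sketched as a simple explore/exploit rule: keep rotating a small fraction of the time, and otherwise serve whichever ad has the higher observed click-through rate. Everything below (function name, the 10% rotation share, the sample counts) is a hypothetical illustration:

```python
import random

# Hypothetical sketch of rotate-then-shift ad serving.
# stats maps each ad name to [clicks, impressions].
def pick_ad(stats, explore=0.1):
    if random.random() < explore:          # keep some rotation going
        return random.choice(list(stats))
    # otherwise exploit: serve the ad with the best click-through rate
    return max(stats, key=lambda ad: stats[ad][0] / max(stats[ad][1], 1))

stats = {"ad_a": [30, 1000], "ad_b": [55, 1000]}
winner = pick_ad(stats, explore=0.0)       # with no rotation: "ad_b"
```

This is the same trade-off studied formally as a "multi-armed bandit" problem: rotation gathers evidence, shifting cashes in on it.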
Most importantly from the government’s perspective, the programs more than paid for themselves. The reduction in UI benefits paid plus the increase in tax receipts from faster reemployment were more than enough to pay for the cost of providing the search assistance. For every dollar invested in job assistance, the government saved about two dollars.
Waldfogel’s “a-ha” moment was simply that random judicial assignment would allow him to rank judges on their sentencing proclivity. If judges within a district were seeing the same kinds of cases, then intra-district disparities in criminal sentencing had to be attributable to differences in judicial temperament.
So what’s the answer? Well, the best evidence is that neither side in the debate is right. Putting people in jail neither increases nor decreases the probability that they’ll commit a crime when they’re released.
The spread of randomized testing is also due to the hard work of the Poverty Action Lab. Founded at MIT in 2003 by Abhijit Banerjee, Esther Duflo, and Sendhil Mullainathan, the Poverty Action Lab is devoted to using randomized trials to test what development strategies actually work.
Today, every study is given a grade (on a fifteen-category scale developed by the Oxford Centre for Evidence-Based Medicine) as a shorthand way to inform the reader of the quality of evidence. The highest possible grade (“1a”) is only awarded when there are multiple randomized trials with similar results, while the lowest grade goes to suggested treatments that are based solely on expert opinion.
The success of evidence-based medicine is the rise of data-based decision making par excellence. It is decision making based not on intuition or personal experience, but on systematic statistical studies.
Researchers have found that about 10 percent of the time, Isabel helps doctors include a major diagnosis that they would not have considered but should have. Isabel is constantly putting itself to the test. Every week the New England Journal of Medicine includes a diagnostic puzzler in its pages. Simply cutting and pasting the patient’s case history into the input section allows Isabel to produce a list of ten to thirty diagnoses.
Andrew Martin and Kevin Quinn. Martin and Quinn were presenting a paper claiming that, by using just a few variables concerning the politics of the case, they could predict how Supreme Court justices would vote.
Way back in 1954, Paul Meehl wrote a book called Clinical Versus Statistical Prediction. This slim volume created a storm of controversy among psychologists because it reported the results of about twenty other empirical studies that compared how well “clinical” experts could predict relative to simple statistical models. The studies concerned a diverse set of predictions, such as how patients with schizophrenia would respond to electroshock therapy or how prisoners would respond to parole. Meehl’s startling finding was that none of the studies suggested that experts could outpredict the simple statistical models.
As long as you have a large enough dataset, almost any decision can be crunched.
In fact, it’s possible to test your own ability to make unbiased estimates. For each of the following ten questions, give the range of answers that you are 90 percent confident contains the correct answer.
If all ten of your intervals include the correct answer, you’re under-confident.
People think they know more than they actually know.
Less than 1 percent of the people gave ranges that included the right answer nine or ten times. Ninety-nine percent of people were overconfident.
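A quick binomial calculation shows just how striking that result is. If each of your ten ranges really did contain the answer with 90 percent probability, nine-or-ten hits would be the normal outcome, not a rarity:

```python
from math import comb

# Probability of at least k hits in n independent trials that each
# succeed with probability p -- here, 90%-confidence intervals.
def p_at_least(k, n=10, p=0.9):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# A well-calibrated person should land nine or ten hits about 74%
# of the time; fewer than 1% of actual test takers managed it.
print(round(p_at_least(9), 3))
```

The gap between the expected 74 percent and the observed sub-1 percent is the overconfidence the passage is describing.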
“Human judges are not merely worse than optimal regression equations; they are worse than almost any regression equation.”
Indeed, this difference is at the heart of the shift to evidence-based medical guidelines. The traditional expert-generated guidelines just gave undifferentiated pronouncements of what physicians should and should not do. Evidence-based guidelines, for the first time, explicitly tell physicians the quality of the evidence underlying each suggested practice. Signaling the quality of the evidence lets physicians (and patients) know when a guideline is written in stone and when it’s just their best guess given limited information.
Instead of having the statistics as a servant to expert choice, the expert becomes a servant of the statistical machine. Mark E. Nissen, a professor at the Naval Postgraduate School in Monterey, California, who has tested computer-versus-human procurement, sees a fundamental shift toward systems where the traditional expert is stripped of his or her power to make the final
The Rapid Risk Assessment for Sexual Offender Recidivism (RRASOR) is a point system based on a regression analysis of male offenders in Canada.
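The mechanics of a regression-derived point system can be sketched generically: each risk factor's regression coefficient is rounded into a small integer weight, and a case's score is just the sum of the points for the factors it exhibits. The factor names and weights below are illustrative inventions, not the actual RRASOR items:

```python
# Hypothetical point system in the style of a regression-based
# risk instrument. These factors and weights are made up for
# illustration; they are NOT the real RRASOR scoring items.
WEIGHTS = {"prior_offenses": 2, "young_at_release": 1, "stranger_victim": 1}

def risk_score(case):
    """Sum the points for every factor present in the case record."""
    return sum(points for factor, points in WEIGHTS.items() if case.get(factor))

score = risk_score({"prior_offenses": True, "stranger_victim": True})
```

The appeal of such systems is that anyone can apply them consistently with a checklist; the regression behind them, not the scorer's judgment, determined the weights.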
The problem is that these discretionary escape hatches have costs too. “People see broken legs everywhere,” Snijders says, “even when they are not there.” The Mercury astronauts insisted on a literal escape hatch. They balked at the idea of being bolted inside a capsule that could only be opened from the outside.
System builders must carefully consider the costs as well as the benefits of delegating discretion. In context after context, decision makers who wave off the statistical predictions tend to make poorer decisions.
In a word, hypothesize. The most important thing that is left to humans is to use our minds and our intuition to guess at what variables should and should not be included in statistical analysis.
In the new world of database decision making, these assessments are merely inputs for a formula, and it is statistics, and not experts, that determine how much weight is placed on the assessments.
Universities are loath to accept that a computer could select better students. Book publishers would be loath to delegate the final say in acquiring manuscripts to an algorithm.
The rise of Acxiom shows how commercialization has increased the fluidity of information across organizations. Some large retailers like Amazon.com and Wal-Mart simply sell aggregate customer transaction information. Want to know how well Crest toothpaste sells if it’s placed higher on the shelf? Target will sell you the answer. But Acxiom also allows vendors to trade information. By providing Acxiom’s transaction information about its individual customers, a retailer can gain access to a data warehouse of staggering proportions.
A “data commons” movement has created websites for people to post and link their data with others. In the last ten years, the norm of sharing datasets has become increasingly grounded in academics. The premier economics journal in the United States, the American Economic Review, requires that researchers post to a centralized website all the data backing up their empirical articles. So many researchers are posting their datasets to their personal web pages that it is now more likely than not that you can download the data for just about any empirical article by just typing a few words into a search engine.
The studio head was bragging about the results of a paradigm-shifting experiment in which Epagogix was asked to predict the gross revenues of nine motion pictures just based on their scripts—before the stars or the directors had even been chosen. What made the CEO so excited was that the neural equations had been able to accurately predict the profitability of six out of nine films. On a number of the films, the formula’s revenue prediction was within a few million dollars of the actual gross.
If you can’t measure what you’re trying to maximize, you’re not going to be able to rely on data-driven decisions.
The DI approach is the brainchild of Siegfried “Zig” Engelmann, who started studying how best to teach reading at the University of Illinois in the 1960s.
Direct Instruction won hands down. Education writer Richard Nadler summed it up this way: “When the testing was over, students in DI classrooms had placed first in reading, first in math, first in spelling, and first in language. No other model came close.”