Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
9%
Each application of PA is defined by:
1. What’s predicted: The kind of behavior (i.e., action, event, or happening) to predict for each individual, stock, or other kind of element.
2. What’s done about it: The decisions driven by prediction; the action taken by the organization in response to or informed by each prediction.
Walter Adamson
The most fundamental question to ask about big data.
11%
Predictive model—A mechanism that predicts a behavior of an individual, such as click, buy, lie, or die. It takes characteristics of the individual as input, and provides a predictive score as output. The higher the score, the more likely it is that the individual will exhibit the predicted behavior.
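As a minimal sketch of that definition (characteristics in, predictive score out), the Python below scores two individuals. The variable names, the weights, and the logistic formula are all assumptions made for illustration; the passage does not say how any particular model computes its score.

```python
import math

# Hypothetical weights, invented for this sketch; a real model learns them from data.
WEIGHTS = {"weeks_since_last_purchase": -0.15, "purchases_last_year": 0.30}
BIAS = -1.0

def predictive_score(characteristics):
    """Turn an individual's characteristics into a score between 0 and 1."""
    total = BIAS + sum(
        weight * characteristics.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    # Logistic function: larger totals give scores closer to 1.
    return 1.0 / (1.0 + math.exp(-total))

frequent_buyer = {"weeks_since_last_purchase": 1, "purchases_last_year": 12}
lapsed_buyer = {"weeks_since_last_purchase": 40, "purchases_last_year": 1}

# The higher score marks the individual judged more likely to buy.
print(predictive_score(frequent_buyer) > predictive_score(lapsed_buyer))
```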
17%
As for any domain of PA, the predictive model zips up these various factors into a single score—in this case, a Flight Risk score—for each individual.
19%
PA has taken on an enormous crime wave. It is central to tackling fraud, and promises to bolster street-level policing as well. In these efforts, PA’s power optimizes the assignment of resources. Its predictions dictate how enforcers spend their time—which transactions auditors search for fraud and which street corners cops search for crime.
20%
But when a criminal who would not reoffend is kept in prison because of a false prediction, we will never have the luxury of knowing. There’s a certain finality here, the impossibility of undoing. You can prove innocent a legitimate transaction wrongly flagged as fraudulent, but an incarcerated person has no recourse to disprove unjust assumptions about what his or her future behavior outside prison would have been.
20%
What is new here, despite a general movement toward upgrading decision making with data, is entrusting a machine to contribute to these life-changing decisions for which there can be no accountability.
21%
“Privacy and analytics are often publicly positioned as mortal enemies, but are they really?” asks Ari Schwartz of the U.S. Department of Commerce’s National Institute of Standards and Technology. Indeed, some data hustlers want a free-for-all, while others want to throw the baby out with the bathwater. But Schwartz suggests, “The two worlds may have some real differences, but can probably live a peaceful coexistence if they simply understand where the other is coming from.”
21%
PA is an important, blossoming science. Foretelling your future behavior and revealing your intentions, it’s an extremely powerful tool—and one with significant potential for misuse. It’s got to be managed with extreme care. The agreement we collectively come to for PA’s position in the world is central to the massive cultural shifts we face as we fully enter and embrace the information age.
21%
We are up to our ears in data, but how much can this raw material really tell us? What actually makes it predictive? Does existing data go so far as to reveal the collective mood of the human populace? If yes, how does our emotional online chatter relate to the economy’s ups and downs?
23%
Beyond scientific validation, a tantalizing prospect lingered: stock market prediction. If collective emotion proved to be reflected by subsequent stock movements, the blog mood readings could serve to predict them. This kind of new predictive clue could hold the potential to make millions.
23%
As with most applications of predictive analytics (PA), Eric and Karrie’s system repurposes data. Whatever the intended purpose and target audience of the bloggers, they are delivering a wellspring of raw material that lies dormant, waiting to be reinterpreted by listening to it in a new way that uncovers new meaning and insight. It’s as if data scientists are the intelligent aliens (personality jokes aside) we hoped for, successfully deciphering the human race’s signals.
23%
Repurposing data for PA signifies a mammoth new recycling initiative. Like millions of chicken feet the United States has realized it can sell to China rather than throw away, our phenomenal accumulation of 1’s and 0’s surprises us over and over with newfound applications. Calamari was originally considered junk, as was the basis for white chocolate.
24%
revolution is “the instrumentation of everything.” More and more, each move you make, online and offline, is recorded, including transactions conducted, websites visited, movies watched, links clicked, friends called, opinions posted, dental procedures endured, sports games won (if you’re a professional athlete), traffic cameras passed, flights taken, Wikipedia articles edited, and earthquakes experienced. Countless sensors deploy daily. Mobile devices, robots, and shipping containers record movement, interactions, inventory counts, and radiation levels. Personal health monitors watch your ...
24%
A new window on the world has opened. Professor Erik Brynjolfsson, an economist at Massachusetts Institute of Technology (MIT), compares this mass instrumentation of human behavior to another historic breakthrough in scientific observation. “The microscope, invented four centuries ago, allowed people to see and measure things as never before—at the cellular level,” said the New York Times, explaining Brynjolfsson’s perspective. “It was a revolution in measurement. Data measurement is the modern equivalent of the microscope.” But rather than viewing things previously too small to see, now we ...
24%
The next logical question is: what’s the most valuable thing to do with all this stuff? This book’s answer: Learn from it how to predict.
24%
Big data does not exist. The elephant in the room is that there is no elephant in the room. What’s exciting about data isn’t how much of it there is, but how quickly it is growing. We’re in a persistent state of awe at data’s sheer quantity because of one thing that does not change: There’s always so much more today than yesterday. Size is relative, not absolute. If we use the word big today, we’ll quickly run out of adjectives: “big data,” “bigger data,” “even bigger data,” and “biggest data.” The International Conference on Very Large Databases has been running since 1975. We have a dearth ...
25%
“Big data” is grammatically incorrect. It’s like saying “big water.” Rather, it should be “a lot of data” or “plenty of data.” Size doesn’t matter. It’s the rate of expansion.
25%
The answer is simple. Everything is connected to everything else—if only indirectly—and this is reflected in data.
25%
Data always speaks. It always has a story to tell, and there’s always something to learn from it. Data scientists see this over and over again across PA projects.
25%
The Data Effect: Data is always predictive.
25%
This is the assumption behind the leap of faith an organization takes when undertaking PA. Budgeting the staff and tools for a PA project requires this leap, knowing not what specifically will be discovered and yet trusting that something will be. Sitting on an expert panel at Predictive Analytics World, leading UK consultant Tom Khabaza put it this way: “Projects never fail due to lack of patterns.” With The Data Effect in mind, the scientist rests easy. Data is the new oil.
25%
Prediction starts small. PA’s building block is the predictor variable, a single value measured for each individual. For example, recency, the number of weeks since the last time an individual made a purchase, committed a crime, or exhibited a medical symptom, often reveals the chances he or she will do it again in the near term.
25%
Similarly, frequency—the number of times the individual has exhibited the behavior—is also a common, fruitful measure. People who have done something a lot are more likely to do it again.
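Recency and frequency can each be derived from nothing more than a list of dated events. The sketch below is one possible way to compute them; the function names, the choice to count whole weeks, and the sample purchase dates are assumptions made for illustration.

```python
from datetime import date

def recency_weeks(event_dates, today):
    """Whole weeks elapsed since the most recent event."""
    return (today - max(event_dates)).days // 7

def frequency(event_dates):
    """Number of times the individual has exhibited the behavior."""
    return len(event_dates)

# Hypothetical purchase history for one individual.
purchases = [date(2024, 1, 5), date(2024, 2, 9), date(2024, 3, 1)]
today = date(2024, 3, 29)

print(recency_weeks(purchases, today))  # 4
print(frequency(purchases))             # 3
```

Each function returns a single value per individual, which is what makes it usable as a predictor variable.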
25%
In fact, it is usually what individuals have done that predicts what they will do. And so PA feeds on data that extends past dry yet essential demographics like location and gender to include behavioral predictors such as recency, frequency, purchases, financial activity, and product usage such as calls and web surfing. These behaviors are often the most valuable—it’s always a behavior that we seek to predict, and indeed behavior predicts behavior. As Jean-Paul Sartre put it, “[A man’s] true self is dictated by his actions.”
25%
Poring over a potpourri of prospective predictors, PA’s aim isn’t only to assess human hunches by testing relationships that seem to make sense, but also to explore a boundless playing field of possible truths beyond the realms of intuition. And so, with The Data Effect in play, PA drops onto your desk connections that seem to defy logic. As strange, mystifying, or unexpected as they may seem, these discoveries help predict.
27%
The dilemma is, as it is often said, correlation does not imply causation. The discovery of a predictive relationship between A and B does not mean one causes the other, not even indirectly. No way, nohow.
48%
To this end, the predictive models are trained over 5.7 million examples of a Jeopardy! question paired with a candidate answer. Each example includes 550 predictor variables that summarize the various measures of evidence aggregated for that answer (therefore, the model is made of 550 weights, one per variable). This large amount of training data was formed out of 25,000 Jeopardy! questions.
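To make "550 weights, one per variable" concrete, here is a rough sketch of a model of that shape. Only the counts (550 variables, 5.7 million examples, 25,000 questions) come from the passage. The random weights, the random evidence values, and the logistic squashing of the weighted sum are assumptions for illustration, not a description of the actual system.

```python
import math
import random

NUM_VARIABLES = 550  # one weight per predictor variable, as in the passage

rng = random.Random(0)
# Placeholder weights; in a trained model these are learned from the examples.
weights = [rng.uniform(-0.1, 0.1) for _ in range(NUM_VARIABLES)]

def confidence(evidence):
    """Combine 550 evidence measures into one score between 0 and 1."""
    if len(evidence) != NUM_VARIABLES:
        raise ValueError("expected one value per predictor variable")
    total = sum(w * x for w, x in zip(weights, evidence))
    return 1.0 / (1.0 + math.exp(-total))

# Placeholder evidence values for one candidate answer.
candidate = [rng.random() for _ in range(NUM_VARIABLES)]
print(confidence(candidate))

# Average number of training examples (candidate answers) per question.
examples_per_question = 5_700_000 / 25_000
print(examples_per_question)  # 228.0
```

The last line shows where the volume of training data comes from: each question contributes a few hundred question-answer pairs on average.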
53%
The uplift score answers the question, “How much more likely is this treatment to generate the desired outcome than the alternative treatment?” It guides an organization’s choice of treatment or action, what to do or say to each individual. The secondary treatment can be the passive action of a control set—for example, make no marketing contact or administer a placebo instead of the trial drug—in which case an uplift model effectively decides whether or not to treat.
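One common way to write this question down, offered here as an assumption because the passage does not give a formula, is as a difference between two predicted probabilities for the same individual with characteristics x:

```latex
\mathrm{uplift}(x) = P(\text{outcome} \mid x,\ \text{treatment A}) - P(\text{outcome} \mid x,\ \text{treatment B})
```

When treatment B is the passive control, a positive value suggests treating the individual and a value at or below zero suggests leaving the individual alone.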
55%
Uplift modeling conquers the imperceivable, influence, by newly combining two well-trodden, previously separate paradigms:
1. Comparing treated and control results.
2. Predictive modeling (machine learning, statistical regression, etc.).
55%
The Persuasion Effect: Although imperceivable, the persuasion of an individual can be predicted by uplift modeling, predictively modeling across two distinct training data sets that record, respectively, the outcomes of two competing treatments.
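The idea of modeling across two training sets can be sketched with a deliberately simple stand-in for a predictive model: the observed response rate per customer segment. The segments, the records, and the choice of a per-segment rate as the "model" are all hypothetical; real uplift models use richer learners.

```python
from collections import defaultdict

def response_rates(records):
    """Fit a very simple model: the observed response rate for each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, total]
    for segment, responded in records:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}

# Two training sets, one per competing treatment (invented data).
treated = [("new", True), ("new", True), ("new", False), ("new", False),
           ("loyal", True), ("loyal", True), ("loyal", True), ("loyal", False)]
control = [("new", False), ("new", False), ("new", False), ("new", True),
           ("loyal", True), ("loyal", True), ("loyal", True), ("loyal", False)]

treated_model = response_rates(treated)
control_model = response_rates(control)

def uplift_score(segment):
    """Predicted response if treated minus predicted response if not."""
    return treated_model[segment] - control_model[segment]

print(uplift_score("new"))    # 0.25
print(uplift_score("loyal"))  # 0.0
```

In this toy data the loyal segment responds at the same rate either way, so contacting it changes nothing, while the new segment is where the treatment appears to persuade.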