Kindle Notes & Highlights
Read between November 4, 2020 and March 20, 2021
In this book we introduce a collection of the most important fundamental concepts of data science. Some of these concepts are “headliners” for chapters, and others are introduced more naturally through the discussions (and thus they are not necessarily labeled as fundamental concepts). The concepts span the process from envisioning the problem, to applying data science techniques, to deploying the results to improve decision-making. The concepts also undergird a large array of business analytics methods and techniques. The concepts fit into three general types: Concepts about how data science ...
The past fifteen years have seen extensive investments in business infrastructure, which have improved the ability to collect data throughout the enterprise.
Probably the widest applications of data-mining techniques are in marketing for tasks such as targeted marketing, online advertising, and recommendations for cross-selling.
There is a fundamental structure to data-analytic thinking, and basic principles that should be understood. There are also particular areas where intuition, creativity, common sense, and domain knowledge must be brought to bear.
Economist Erik Brynjolfsson and his colleagues from MIT and Penn’s Wharton School conducted a study of how DDD affects firm performance (Brynjolfsson, Hitt, & Kim, 2011). They developed a measure of DDD that rates firms as to how strongly they use data to make decisions across the company. They show that statistically, the more data-driven a firm is, the more productive it is — even controlling for a wide range of possible confounding factors.
The decisions of interest in this book fall mainly into two types: (1) decisions for which “discoveries” need to be made within data, and (2) decisions that repeat, especially at massive scale, so that decision-making can benefit from even small increases in accuracy based on data analysis.
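To make the second type of decision concrete, here is a back-of-the-envelope sketch in Python. The function name and every number in it are hypothetical, chosen only for illustration; none of them come from the book. It assumes each additional correct decision is worth the same fixed amount.

```python
def yearly_value_of_accuracy_gain(decisions_per_year, accuracy_gain, value_per_correct_decision):
    """Extra yearly value from getting a slightly larger share of repeated decisions right.

    Assumes (for illustration) that every additional correct decision
    is worth the same fixed amount.
    """
    extra_correct_decisions = decisions_per_year * accuracy_gain
    return extra_correct_decisions * value_per_correct_decision


# Hypothetical scenario: 10 million repeated decisions a year,
# accuracy improved by one percentage point, 2.00 gained per extra correct decision.
extra_value = yearly_value_of_accuracy_gain(
    decisions_per_year=10_000_000,
    accuracy_gain=0.01,
    value_per_correct_decision=2.0,
)
print(f"Extra value per year: {extra_value:,.0f}")
```

The point of the sketch is only the multiplication: a gain that looks negligible for a single decision becomes material once it is applied to millions of them.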
A separate study, conducted by economist Prasanna Tambe of NYU’s Stern School, examined the extent to which big data technologies seem to help firms (Tambe, 2012). He finds that, after controlling for various possible confounding factors, using big data technologies is associated with significant additional productivity growth.
data, and the capability to extract useful knowledge from data, should be regarded as key strategic assets.
As with all assets, it is often necessary to make investments.
(Fayyad, Piatetsky-Shapiro, & Smyth, 1996),
Fundamental concept: Extracting useful knowledge from data to solve business problems can be treated systematically by following a process with reasonably well-defined stages.
The Cross Industry Standard Process for Data Mining, abbreviated CRISP-DM (CRISP-DM Project, 2000), is one codification of this process.
Fundamental concept: From a large mass of data, information technology can be used to find informative descriptive attributes of entities of interest.
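One common way to score how informative an attribute is, used here as an assumption since the highlight itself names no measure, is information gain: the reduction in entropy of a target variable after splitting on the attribute. The sketch below uses a tiny made-up customer table; the attribute names (`plan`, `region`, `churned`) and the rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    parent_entropy = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    weighted_child_entropy = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return parent_entropy - weighted_child_entropy


# Hypothetical toy data: "plan" separates churners perfectly, "region" not at all.
customers = [
    {"plan": "basic", "region": "north", "churned": "yes"},
    {"plan": "basic", "region": "south", "churned": "yes"},
    {"plan": "premium", "region": "north", "churned": "no"},
    {"plan": "premium", "region": "south", "churned": "no"},
]

# Rank the candidate attributes from most to least informative.
ranking = sorted(
    ["plan", "region"],
    key=lambda attribute: information_gain(customers, attribute, "churned"),
    reverse=True,
)
for attribute in ranking:
    print(attribute, information_gain(customers, attribute, "churned"))
```

On real data the same ranking step would run over many attributes and many more rows; the toy table is kept small so the two extremes (a perfectly informative attribute and an uninformative one) are easy to check by hand.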
Fundamental concept: If you look too hard at a set of data, you will find something — but it might not generalize beyond the data you’re looking at.
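A small simulation can illustrate this concept. The sketch below is a hypothetical setup, not taken from the book: labels and features are independent coin flips, so there is no real pattern to find. Searching across many features still turns up one that matches the labels well on a small sample, and that apparent pattern disappears on fresh data. The function names, sizes, and seed are all illustrative choices.

```python
import random


def accuracy(feature_values, labels):
    """Fraction of positions where the feature value equals the label."""
    matches = sum(f == y for f, y in zip(feature_values, labels))
    return matches / len(labels)


def search_for_pattern(n_features=200, n_train=20, n_holdout=10_000, seed=0):
    """Pick the best-looking noise feature on a small sample, then check it on new data."""
    rng = random.Random(seed)

    # Labels and features are independent coin flips: nothing real links them.
    train_labels = [rng.randint(0, 1) for _ in range(n_train)]
    train_features = [
        [rng.randint(0, 1) for _ in range(n_train)] for _ in range(n_features)
    ]

    # "Looking too hard": keep whichever feature happens to match the labels best.
    best = max(range(n_features), key=lambda j: accuracy(train_features[j], train_labels))
    train_accuracy = accuracy(train_features[best], train_labels)

    # Fresh observations. Because the chosen feature is pure noise by construction,
    # its new values are simply new coin flips, independent of the new labels.
    holdout_labels = [rng.randint(0, 1) for _ in range(n_holdout)]
    holdout_feature = [rng.randint(0, 1) for _ in range(n_holdout)]
    holdout_accuracy = accuracy(holdout_feature, holdout_labels)

    return train_accuracy, holdout_accuracy


train_accuracy, holdout_accuracy = search_for_pattern()
print(f"accuracy on the data we searched: {train_accuracy:.2f}")
print(f"accuracy on fresh data:           {holdout_accuracy:.2f}")
```

Evaluating on data that played no part in the search is the design choice that exposes the problem: the selected feature looks predictive only because it was selected.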
Fundamental concept: Formulating data mining solutions and evaluating the results involves thinking carefully about the context in which they will be used.