Roberto Rigolin F Lopes’s Kindle Notes & Highlights for Data Science

Data Science

by John D. Kelleher

Read on October 20, 2019

15%

As they put it, the key to understanding modern life is “knowing what to measure and how to measure it” (2009, 14).

52%

However, because ML algorithms are biased to look for different types of patterns, and because there is no one learning bias across all situations, there is no one best ML algorithm.

52%

In fact, a theorem known as the “no free lunch theorem” (Wolpert and Macready 1997) states that there is no one best ML algorithm that on average outperforms all other algorithms across all possible data sets.
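The idea that different learning biases suit different data sets can be sketched with a toy comparison. This is only an illustration of the intuition, not a demonstration of the theorem itself: the two learners (`fit_line`, `fit_nearest`), the two tiny data sets, and the mean-squared-error measure are all hypothetical choices made for this example and do not come from the book.

```python
# Toy illustration of learning bias (all names and data are hypothetical).

def fit_line(xs, ys):
    """Least-squares straight line: biased toward linear patterns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbor: biased toward 'similar inputs, same output'."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a fitted model on held-out points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def linear_target(x):
    return 2 * x + 1

def step_target(x):
    return 0 if x < 5 else 10

train_xs = list(range(10))
test_xs = [x + 0.25 for x in train_xs]  # held-out points between training points

results = {}
for name, target in [("linear", linear_target), ("step", step_target)]:
    train_ys = [target(x) for x in train_xs]
    test_ys = [target(x) for x in test_xs]
    results[name] = {
        "line": mse(fit_line(train_xs, train_ys), test_xs, test_ys),
        "nearest": mse(fit_nearest(train_xs, train_ys), test_xs, test_ys),
    }

for name, scores in results.items():
    print(name, scores)
```

In this sketch the straight-line learner has the lower error on the linear data set and the nearest-neighbor learner has the lower error on the step-shaped one, so neither is the better choice on both.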

54%

Finally, the world changes, and models don’t. Implicit in the ML process of data set construction, model training, and model evaluation is the assumption that the future will be the same as the past. This assumption is known as the stationarity assumption: the processes or behaviors that are being modeled are stationary through time (i.e., they don’t change). Data sets are intrinsically historic in the sense that data are representations of observations that were made in the past. So, in effect, ML algorithms search through the past for patterns that might generalize to the future. Obviously, ...

54%

Data scientists use the term concept drift to describe how a process or behavior can change, or drift, as time passes.
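A minimal sketch of what the stationarity assumption and concept drift mean in practice, assuming a hypothetical one-dimensional threshold classifier and made-up data in which the true decision boundary moves from 5 to 8 over time. The helper names (`fit_threshold`, `accuracy`) are illustrative and not taken from the book.

```python
# Hypothetical example: a model trained on the past, evaluated after the process drifts.

def fit_threshold(xs, labels):
    """Learn a cut point halfway between the largest negative and smallest positive example."""
    highest_negative = max(x for x, y in zip(xs, labels) if y == 0)
    lowest_positive = min(x for x, y in zip(xs, labels) if y == 1)
    return (highest_negative + lowest_positive) / 2

def accuracy(threshold, xs, labels):
    """Fraction of examples where 'x > threshold' matches the true label."""
    hits = sum((x > threshold) == bool(y) for x, y in zip(xs, labels))
    return hits / len(xs)

xs = list(range(11))
past_labels = [int(x > 5) for x in xs]      # the process as observed in the past
drifted_labels = [int(x > 8) for x in xs]   # the same process after it has drifted

model = fit_threshold(xs, past_labels)
accuracy_on_past = accuracy(model, xs, past_labels)
accuracy_after_drift = accuracy(model, xs, drifted_labels)

# Retraining on recent data is one common response to drift.
retrained_model = fit_threshold(xs, drifted_labels)
accuracy_after_retraining = accuracy(retrained_model, xs, drifted_labels)

print("past:", accuracy_on_past)
print("after drift:", accuracy_after_drift)
print("after retraining:", accuracy_after_retraining)
```

The model fits the historical data perfectly, loses accuracy once the boundary moves, and recovers only when it is refit on data from the changed process.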

93%

Few, Stephen. 2012. Show Me the Numbers: Designing Tables and Graphs to Enlighten. 2nd ed. Burlingame, CA: Analytics Press.