The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Kindle Notes & Highlights
1%
The grand aim of science is to cover the greatest number of experimental facts by logical deduction from the smallest number of hypotheses or axioms. —Albert Einstein
1%
Civilization advances by extending the number of important operations we can perform without thinking about them. —Alfred North Whitehead
2%
Computers aren’t supposed to be creative; they’re supposed to do what you tell them to. If what you tell them to do is be creative, you get machine learning. A learning algorithm is like a master craftsman: every one of its productions is different and exquisitely tailored to the customer’s needs.
2%
the world senses what you want and changes accordingly, without you having to lift a finger.
Note (Tim Moore): in a virtual reality
2%
they are limited to what we can systematically observe and tractably model. Big data and machine learning greatly expand that scope. Some everyday things can be predicted by the unaided mind, from catching a ball to carrying on a conversation. Some things, try as we might, are just unpredictable. For the vast middle ground between the two, there’s machine learning.
2%
The psychologist Don Norman coined the term conceptual model to refer to the rough knowledge of a technology we need to have in order to use it effectively. This book provides you with a conceptual model of machine learning.
3%
Each of the five tribes of machine learning has its own master algorithm, a general-purpose learner that you can in principle use to discover knowledge from data in any domain. The symbolists’ master algorithm is inverse deduction, the connectionists’ is backpropagation, the evolutionaries’ is genetic programming, the Bayesians’ is Bayesian inference, and the analogizers’ is the support vector machine. In practice, however, each of these algorithms is good for some things but not others. What we really want is a single algorithm combining the key features of all of them:
3%
If it exists, the Master Algorithm can derive all knowledge in the world—past, present, and future—from data. Inventing it would be one of the greatest advances in the history of science. It would speed up the progress of knowledge across the board, and change the world in ways that we can barely begin to imagine. The Master Algorithm is to machine learning what the Standard Model is to particle physics or the Central Dogma to molecular biology: a unified theory that makes sense of everything we know to date, and lays the foundation for decades or centuries of future progress. The Master …
4%
If you’re a machine-learning expert, you’re already familiar with much of what the book covers, but you’ll also find in it many fresh ideas, historical nuggets, and useful examples and analogies.
4%
every algorithm, no matter how complex, can be reduced to just these three operations: AND, OR, and NOT.
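To make that reduction concrete, here is a minimal sketch (my own illustration, not from the book): a one-bit half adder built from nothing but AND, OR, and NOT.

```python
# A minimal sketch: composing AND, OR, and NOT into something more complex,
# here a one-bit half adder (sum bit and carry bit of a + b).

def AND(a, b): return a & b
def OR(a, b):  return a | b
def NOT(a):    return 1 - a

def XOR(a, b):
    # XOR expressed purely in terms of AND, OR, and NOT
    return OR(AND(a, NOT(b)), AND(NOT(a), b))

def half_adder(a, b):
    return XOR(a, b), AND(a, b)   # (sum bit, carry bit)

for a in (0, 1):
    for b in (0, 1):
        print(a, "+", b, "->", half_adder(a, b))
```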
6%
all humans are mortal, but only 4 percent are Americans.
6%
Machine learning takes many different forms and goes by many different names: pattern recognition, statistical modeling, data mining, knowledge discovery, predictive analytics, data science, adaptive systems, self-organizing systems, and more.
6%
All of the important ideas in machine learning can be expressed math-free.
9%
Naïve Bayes, a learning algorithm that can be expressed as a single short equation.
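As an illustration of that "single short equation", here is a hedged sketch (my own, with made-up toy data): choose the class that maximizes P(class) times the product of P(feature | class) over the features, with simple add-one smoothing so an unseen value does not zero out the product.

```python
# A minimal Naïve Bayes sketch (illustrative only):
#   P(class | features) ∝ P(class) * Π_i P(feature_i | class)
from collections import Counter, defaultdict

def train(examples):
    # examples: list of (features_dict, label)
    priors = Counter(label for _, label in examples)
    likelihoods = defaultdict(Counter)        # (label, feature) -> value counts
    for features, label in examples:
        for f, v in features.items():
            likelihoods[(label, f)][v] += 1
    return priors, likelihoods, len(examples)

def predict(features, priors, likelihoods, n):
    best, best_score = None, 0.0
    for label, count in priors.items():
        score = count / n                     # the prior P(class)
        for f, v in features.items():
            # add-one (Laplace) smoothing for each per-feature likelihood
            score *= (likelihoods[(label, f)][v] + 1) / (count + 2)
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical toy data: whether to play tennis
data = [({"outlook": "sunny", "windy": False}, "yes"),
        ({"outlook": "rainy", "windy": True}, "no"),
        ({"outlook": "sunny", "windy": True}, "yes"),
        ({"outlook": "rainy", "windy": False}, "no")]
p, l, n = train(data)
print(predict({"outlook": "sunny", "windy": False}, p, l, n))  # -> "yes"
```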
10%
nearest-neighbor algorithm,
10%
decision tree learners
13%
engineering successes are not proof of scientific validity.
13%
even if they didn’t succeed, they learned many valuable lessons.
17%
Our search for the Master Algorithm is complicated, but also enlivened, by the rival schools of thought that exist within machine learning. The main ones are the symbolists, connectionists, evolutionaries, Bayesians, and analogizers.
17%
For symbolists, all intelligence can be reduced to manipulating symbols, in the same way that a mathematician solves equations by replacing expressions by other expressions. Symbolists understand that you can’t learn from scratch: you need some initial knowledge to go with the data.
17%
Their master algorithm is inverse deduction, which figures out what knowledge is missing in order to make a deduction go through, and then makes it as general as possible. For connectionists, learning is what the brain does, and so what we need to do is reverse engineer it. The brain learns by adjusting the strengths of connections between neurons, and the crucial problem is figuring out which connections are to blame for which errors and changing them accordingly.
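A toy illustration of the inverse-deduction idea (my own sketch, not the book's algorithm): given the fact "Socrates is human" and the observation "Socrates is mortal", propose the missing rule that would make the deduction go through, stated as generally as the data allows.

```python
# Toy sketch of inverse deduction: induce the missing general rule that,
# together with the known fact, would let the observed conclusion be deduced.

fact = ("Socrates", "is_human")
observation = ("Socrates", "is_mortal")

def induce_rule(fact, observation):
    entity_f, property_f = fact
    entity_o, property_o = observation
    if entity_f == entity_o:
        # Generalize from the specific entity to a universally quantified rule.
        return f"For all X: {property_f}(X) -> {property_o}(X)"
    return None

print(induce_rule(fact, observation))
# -> "For all X: is_human(X) -> is_mortal(X)"
```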
18%
Our aim is to touch each part without jumping to conclusions; and once we’ve touched all of them, we will try to picture the whole elephant.
18%
Insight and persistence are what counts.
18%
Are you a rationalist or an empiricist? Rationalists believe that the senses deceive and that logical reasoning is the only sure path to knowledge. Empiricists believe that all reasoning is fallible and that knowledge must come from observation and experimentation.
18%
In computer science, theorists and knowledge engineers are rationalists; hackers and machine learners are empiricists.
18%
The rationalist likes to plan everything in advance before making the first move. The empiricist prefers to try things and see how they turn out.
18%
Descartes, Spinoza, and Leibniz were the leading rationalists;
18%
Locke, Berkeley, and Hume were their empiricist counterparts.
18%
How can we ever be justified in generalizing from what we’ve seen to what we haven’t? Every learning algorithm is, in a sense, an attempt to answer this question.
19%
Is there any way to learn something from the past that we can be confident will apply in the future? And if there isn’t, isn’t machine learning a hopeless enterprise? For that matter, isn’t all of science, even all of human knowledge, on rather shaky ground?
20%
And therefore, on average over all possible worlds, pairing each world with its antiworld, your learner is equivalent to flipping coins.
20%
We don’t care about all possible worlds, only the one we live in. If we know something about the world and incorporate it into our learner, it now has an advantage over random guessing. To this Hume would reply that that knowledge must itself have come from induction and is therefore fallible. That’s true, even if the knowledge was encoded into our brains by evolution, but it’s a risk we’ll have to take. We can also ask whether there’s a nugget of knowledge so incontestable, so fundamental, that we can build all induction on top of it.
20%
In the meantime, the practical consequence of the “no free lunch” theorem is that there’s no such thing as learning without knowledge. Data alone is not enough. Starting from scratch will only get you to scratch. Machine learning is a kind of knowledge pump: we can use it to extract a lot of knowledge from data, but first we have to prime the pump. Machine learning is what mathematicians call an ill-posed problem: it doesn’t have a unique solution. Here’s a simple ill-posed problem: Which two numbers add up to 1,000? Assuming the numbers are positive, there are five hundred possible answers: 1 …
Note (Tim Moore): 500 possible answers assumes that the numbers are positive integers
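A quick check of that count, under the note's reading (positive integers, counting a pair and its reverse as one answer):

```python
# Enumerate the answers to "which two positive integers add up to 1,000?",
# treating (a, b) and (b, a) as the same answer.
pairs = [(a, 1000 - a) for a in range(1, 501)]   # (1, 999) ... (500, 500)
print(len(pairs))  # 500
```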
20%
Newton’s principle is the first unwritten rule of machine learning. We induce the most widely applicable rules we can and reduce their scope only when the data forces us to.
20%
Newton’s principle is only the first step, however. We still need to figure out what is true of everything we’ve seen—how to extract the regularities from the raw data.
22%
Our beliefs are based on our experience, which gives us a very incomplete picture of the world, and it’s easy to jump to false conclusions.
22%
Overfitting happens when you have too many hypotheses and not enough data to tell them apart.
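A small sketch of what that means (my own example): over three binary inputs there are 256 possible Boolean concepts, and after seeing only two labeled examples, 64 of them still fit the data perfectly, so the data cannot tell them apart.

```python
# Too many hypotheses, not enough data: count the Boolean concepts over
# 3 binary inputs that remain consistent with just two training examples.
from itertools import product

inputs = list(product([0, 1], repeat=3))        # all 8 possible inputs
hypotheses = list(product([0, 1], repeat=8))    # all 256 labelings of them

# Hypothetical training data: only two labeled examples.
train = {(0, 0, 0): 0, (1, 1, 1): 1}

consistent = [h for h in hypotheses
              if all(h[inputs.index(x)] == y for x, y in train.items())]
print(len(hypotheses), len(consistent))  # 256 64 -- 64 hypotheses still fit
```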
23%
Harvard’s Leslie Valiant received the Turing Award, the Nobel Prize of computer science, for inventing this type of analysis, which he describes in his book entitled, appropriately enough, Probably Approximately Correct.
26%
Decision trees are used in many different fields. In machine learning, they grew out of work in psychology. Earl Hunt and colleagues used them in the 1960s to model how humans acquire new concepts, and one of Hunt’s graduate students, J. Ross Quinlan, later tried using them for chess. His original goal was to predict the outcome of king-rook versus king-knight endgames from the board positions. From those humble beginnings, decision trees have grown to be, according to surveys, the most widely used machine-learning algorithm. It’s not hard to see why: they’re easy to understand, fast to learn, …
26%
The symbolists’ core belief is that all intelligence can be reduced to manipulating symbols. A mathematician solves equations by moving symbols around and replacing symbols by other symbols according to predefined rules.
26%
The psychologist David Marr argued that every information processing system should be studied at three distinct levels: the fundamental properties of the problem it’s solving; the algorithms and representations used to solve it; and how they are physically implemented.
27%
Despite the popularity of decision trees, inverse deduction is the better starting point for the Master Algorithm. It has the crucial property that incorporating knowledge into it is easy—and we know Hume’s problem makes that essential. Also, sets of rules are an exponentially more compact way to represent most concepts than decision trees. Converting a decision tree to a set of rules is easy: each path from the root to a leaf becomes a rule, and there’s no blowup. On the other hand, in the worst case converting a set of rules into a decision tree requires converting each rule into a …
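A sketch of the easy direction described above (tree to rules), using a made-up tree representation of nested (test, if-true, if-false) tuples with string leaves:

```python
# Read off one rule per root-to-leaf path of a decision tree.
tree = ("outlook=sunny",
        ("windy", "don't play", "play"),   # subtree taken when the test holds
        "play")                            # leaf taken when it doesn't

def tree_to_rules(node, conditions=()):
    if isinstance(node, str):                      # leaf: emit one rule
        lhs = " AND ".join(conditions) or "TRUE"
        return [f"IF {lhs} THEN {node}"]
    test, if_true, if_false = node
    return (tree_to_rules(if_true, conditions + (test,)) +
            tree_to_rules(if_false, conditions + (f"NOT {test}",)))

for rule in tree_to_rules(tree):
    print(rule)
# IF outlook=sunny AND windy THEN don't play
# IF outlook=sunny AND NOT windy THEN play
# IF NOT outlook=sunny THEN play
```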
27%
Inverse deduction is easily confused by noise:
27%
Most seriously, real concepts can seldom be concisely defined by a set of rules.
27%
They require weighing and accumulating weak evidence until a clear picture emerges. Diagnosing an illness involves giving more weight to some symptoms than others, and being OK with incomplete evidence.
27%
Donald Hebb, a Canadian psychologist, stated it this way in his 1949 book The Organization of Behavior:
28%
Perceptrons were invented in the late 1950s by Frank Rosenblatt, a Cornell psychologist.
28%
In a perceptron, a positive weight represents an excitatory connection, and a negative weight an inhibitory one. The perceptron outputs 1 if the weighted sum of its inputs is above threshold, and 0 if it’s below.
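A minimal sketch of that description (the weights and threshold below are made-up illustrative values):

```python
# A perceptron fires (outputs 1) when the weighted sum of its inputs
# clears the threshold, and stays off (outputs 0) otherwise.

def perceptron(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Positive weights act as excitatory connections, negative as inhibitory.
weights = [0.6, 0.4, -0.5]
threshold = 0.5
print(perceptron([1, 1, 0], weights, threshold))  # 1.0 > 0.5 -> fires (1)
print(perceptron([1, 1, 1], weights, threshold))  # 0.5 not above 0.5 -> 0
```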
32%
We could do away with the problem of local optima by taking out the S curves and just letting each neuron output the weighted sum of its inputs. That would make the error surface very smooth, leaving only one minimum—the global one. The problem, though, is that a linear function of linear functions is still just a linear function, so a network of linear neurons is no better than a single neuron. A linear brain, no matter how large, is dumber than a roundworm. S curves are a nice halfway house between the dumbness of linear functions and the hardness of step functions.
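A small numeric check of the point about linearity (my own sketch): stacking two linear layers collapses into a single linear layer, while inserting the S curve (a sigmoid) between them does not.

```python
# Composing linear layers is still linear; adding a sigmoid breaks that.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x = rng.normal(size=3)

two_linear_layers = W2 @ (W1 @ x)      # a "network" of linear neurons
one_linear_layer = (W2 @ W1) @ x       # a single equivalent linear layer
print(np.allclose(two_linear_layers, one_linear_layer))  # True

def sigmoid(z):                         # the S curve: smooth but nonlinear
    return 1 / (1 + np.exp(-z))

with_s_curve = W2 @ sigmoid(W1 @ x)     # no single linear layer reproduces this
print(np.allclose(with_s_curve, one_linear_layer))  # False for generic weights
```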
32%
Backprop was invented in 1986 by David Rumelhart, a psychologist at the University of California, San Diego, with the help of Geoff Hinton and Ronald Williams. Among other things, they showed that backprop can learn XOR,
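A compact sketch of that result (my own toy implementation, not the 1986 one): a small network of sigmoid units trained by backpropagation to fit XOR. With an unlucky initialization it can still land in a local optimum, as discussed above.

```python
# Backpropagation learning XOR with a tiny 2-4-1 sigmoid network.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)           # XOR targets

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)              # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)              # output layer

sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 0.5

for step in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: assign blame for the error to each connection
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient-descent updates
    W2 -= lr * (h.T @ d_out);  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))   # should end up close to [0, 1, 1, 0]
```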