The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Rate it:
2%
Flag icon
“Computers are useless,” said Picasso. “They can only give you answers.”
2%
Flag icon
Homo sapiens is the species that adapts the world to itself instead of adapting itself to the world.
2%
Flag icon
These seemingly magical technologies work because, at its core, machine learning is about prediction: predicting what we want, the results of our actions, how to achieve our goals, how the world will change.
3%
Flag icon
Symbolists view learning as the inverse of deduction and take ideas from philosophy, psychology, and logic. Connectionists reverse engineer the brain and are inspired by neuroscience and physics. Evolutionaries simulate evolution on the computer and draw on genetics and evolutionary biology. Bayesians believe learning is a form of probabilistic inference and have their roots in statistics. Analogizers learn by extrapolating from similarity judgments and are influenced by psychology and mathematical optimization. Driven by the goal of building learning machines, we’ll tour a good chunk of the ...more
4%
Flag icon
The old, tried-and-true statistical tools don’t get you far in the age of big (or even medium) data. You need machine learning’s nonlinear chops to accurately model most phenomena, and it brings with it a new scientific worldview.
4%
Flag icon
An algorithm is a sequence of instructions telling a computer what to do.
4%
Flag icon
Claude Shannon, better known as the father of information theory, was the first to realize that what transistors are doing, as they switch on and off in response to other transistors, is reasoning. (That was his master’s thesis at MIT—the most important master’s thesis of all time.)
4%
Flag icon
Believe it or not, every algorithm, no matter how complex, can be reduced to just these three operations: AND, OR, and NOT.
5%
Flag icon
In any area of science, if a theory cannot be expressed as an algorithm, it’s not entirely rigorous. (Not to mention you can’t use a computer to solve it, which really limits what you can do with it.) Scientists make theories, and engineers make devices. Computer scientists make algorithms, which are both theories and devices.
5%
Flag icon
A programmer—someone who creates algorithms and codes them up—is a minor god, creating universes at will.
5%
Flag icon
Like the Hydra, the complexity monster has many heads. One of them is space complexity: the number of bits of information an algorithm needs to store in the computer’s memory. If the algorithm needs more memory than the computer can provide, it’s useless and must be discarded. Then there’s the evil sister, time complexity: how long the algorithm takes to run, that is, how many steps of using and reusing the transistors it has to go through before it produces the desired results. If it’s longer than we can wait, the algorithm is again useless. But the scariest face of the complexity monster is ...more
5%
Flag icon
Every algorithm has an input and an output: the data goes into the computer, the algorithm does what it will with it, and out comes the result. Machine learning turns this around: in goes the data and the desired result and out comes the algorithm that turns one into the other. Learning algorithms—also known as learners—are algorithms that make other algorithms. With machine learning, computers write their own programs, so we don’t have to.
5%
Flag icon
In farming, we plant the seeds, make sure they have enough water and nutrients, and reap the grown crops. Why can’t technology be more like this? It can, and that’s the promise of machine learning. Learning algorithms are the seeds, data is the soil, and the learned programs are the grown plants. The machine-learning expert is like a farmer, sowing the seeds, irrigating and fertilizing the soil, and keeping an eye on the health of the crop but otherwise staying out of the way.
5%
Flag icon
We can think of machine learning as the inverse of programming, in the same way that the square root is the inverse of the square, or integration is the inverse of differentiation. Just as we can ask “What number squared gives 16?” or “What is the function whose derivative is x + 1?” we can ask, “What is the algorithm that produces this output?”
6%
Flag icon
In machine learning, knowledge is often in the form of statistical models, because most knowledge is statistical: all humans are mortal, but only 4 percent are Americans. Skills are often in the form of procedures: if the road curves left, turn the wheel left; if a deer jumps in front of you, slam on the brakes. (Unfortunately, as of this writing Google’s self-driving cars still confuse windblown plastic bags with deer.) Often, the procedures are quite simple, and it’s the knowledge at their core that’s complex. If you can tell which e-mails are spam, you know which ones to delete. If you can ...more
This highlight has been truncated due to consecutive passage length restrictions.
6%
Flag icon
Many computer scientists, particularly those of an older generation, don’t understand machine learning as well as they’d like to. This is because computer science has traditionally been all about thinking deterministically, but machine learning requires thinking statistically.
6%
Flag icon
The Industrial Revolution automated manual work and the Information Revolution did the same for mental work, but machine learning automates automation itself. Without it, programmers become the bottleneck holding up progress. With it, the pace of progress picks up. If you’re a lazy and not-too-bright computer scientist, machine learning is the ideal occupation, because learning algorithms do all the work but let you take all the credit. On the other hand, learning algorithms could put us out of our jobs, which would only be poetic justice.
6%
Flag icon
As companies grow, they go through three phases. First, they do everything manually: the owners of a mom-and-pop store personally know their customers, and they order, display, and recommend items accordingly. This is nice, but it doesn’t scale. In the second and least happy phase, the company grows large enough that it needs to use computers. In come the programmers, consultants, and database managers, and millions of lines of code get written to automate all the functions of the company that can be automated. Many more people are served, but not as well: decisions are made based on coarse ...more
7%
Flag icon
The best way for a company to ensure that learners like its products is to run them itself. Whoever has the best algorithms and the most data wins. A new type of network effect takes hold: whoever has the most customers accumulates the most data, learns the best models, wins the most new customers, and so on in a virtuous circle (or a vicious one, if you’re the competition).
7%
Flag icon
In the same way that a bank without databases can’t compete with a bank that has them, a company without machine learning can’t keep up with one that uses it. While the first company’s experts write a thousand rules to predict what its customers want, the second company’s algorithms learn billions of rules, a whole set of them for each individual customer. It’s about as fair as spears against machine guns. Machine learning is a cool new technology, but that’s not why businesses embrace it. They embrace it because they have no choice.
7%
Flag icon
Big data is no use if you can’t turn it into knowledge, however, and there aren’t enough scientists in the world for the task.
8%
Flag icon
If computers hadn’t been invented, science would have ground to a halt in the second half of the twentieth century. This might not have been immediately apparent to the scientists because they would have been focused on whatever limited progress they could still make, but the ceiling for that progress would have been much, much lower. Similarly, without machine learning, many sciences would face diminishing returns in the decades to come.
9%
Flag icon
If cyberwar ever comes to pass, the generals will be human, but the foot soldiers will be algorithms. Humans are too slow and too few and would be quickly swamped by an army of bots. We need our own bot army, and machine learning is like West Point for bots.
9%
Flag icon
The solution is to marry machine learning with game theory, something I’ve worked on in the past: don’t just learn to defeat what your opponent does now; learn to parry what he might do against your learner.
9%
Flag icon
In fact, just a few algorithms are responsible for the great majority of machine-learning applications, and we’ll take a look at them in the next few chapters. For example, consider Naïve Bayes, a learning algorithm that can be expressed as a single short equation. Given a database of patient records—their symptoms, test results, and whether or not they had some particular condition—Naïve Bayes can learn to diagnose the condition in a fraction of a second, often better than doctors who spent many years in medical school. It can also beat medical expert systems that took thousands of ...more
10%
Flag icon
If so few learners can do so much, the logical question is: Could one learner do everything? In other words, could a single algorithm learn all that can be learned from data? This is a very tall order, since it would ultimately include everything in an adult’s brain, everything evolution has created, and the sum total of all scientific knowledge. But in fact all the major learners—including nearest-neighbor, decision trees, and Bayesian networks, a generalization of Naïve Bayes—are universal in the following sense: if you give the learner enough of the appropriate data, it can approximate any ...more
10%
Flag icon
Here, then, is the central hypothesis of this book: All knowledge—past, present, and future—can be derived from data by a single, universal learning algorithm. I call this learner the Master Algorithm. If such an algorithm is possible, inventing it would be one of the greatest scientific achievements of all time. In fact, the Master Algorithm is the last thing we’ll ever have to invent because, once we let it loose, it will go on to invent everything else that can be invented. All we need to do is provide it with enough of the right kind of data, and it will discover the corresponding ...more
11%
Flag icon
The most important argument for the brain being the Master Algorithm, however, is that it’s responsible for everything we can perceive and imagine. If something exists but the brain can’t learn it, we don’t know it exists. We may just not see it or think it’s random. Either way, if we implement the brain in a computer, that algorithm can learn everything we can. Thus one route—arguably the most popular one—to inventing the Master Algorithm is to reverse engineer the brain. Jeff Hawkins took a stab at this in his book On Intelligence. Ray Kurzweil pins his hopes for the Singularity—the rise of ...more
11%
Flag icon
Life’s infinite variety is the result of a single mechanism: natural selection. Even more remarkable, this mechanism is of a type very familiar to computer scientists: iterative search, where we solve a problem by trying many candidate solutions, selecting and modifying the best ones, and repeating these steps as many times as necessary. Evolution is an algorithm.
11%
Flag icon
Another line of evidence comes from optimization, the branch of mathematics concerned with finding the input to a function that produces its highest output.
11%
Flag icon
In his book Consilience, the distinguished biologist E. O. Wilson makes an impassioned argument for the unity of all knowledge, from science to the humanities.
12%
Flag icon
According to one school of statisticians, a single simple formula underlies all learning. Bayes’ theorem, as the formula is known, tells you how to update your beliefs whenever you see new evidence. A Bayesian learner starts with a set of hypotheses about the world. When it sees a new piece of data, the hypotheses that are compatible with it become more likely, and the hypotheses that aren’t become less likely (or even impossible). After seeing enough data, a single hypothesis dominates, or a few do. For example, if I’m looking for a program that accurately predicts stock movements and a stock ...more
12%
Flag icon
If you can solve Tetris, you can solve thousands of the hardest and most important problems in science, technology, and management—all in one fell swoop. That’s because at heart they are all the same problem. This is one of the most astonishing facts in all of science.
12%
Flag icon
these are all NP-complete problems, meaning that if you can efficiently solve one of them you can efficiently solve all problems in the class NP, including each other.
12%
Flag icon
P and NP are the two most important classes of problems in computer science. (The names are not very mnemonic, unfortunately.) A problem is in P if we can solve it efficiently, and it’s in NP if we can efficiently check its solution. The famous P = NP question is whether every efficiently checkable problem is also efficiently solvable. Because of NP-completeness, all it takes to answer it is to prove that one NP-complete problem is efficiently solvable (or not). NP is not the hardest class of problems in computer science, but it’s arguably the hardest “realistic” class: if you can’t even check ...more
12%
Flag icon
Often, we do this by reducing them to satisfiability, the canonical NP-complete problem: Can a given logical formula ever be true, or is it self-contradictory? If we invent a learner that can learn to solve satisfiability, it has a good claim to being the Master Algorithm.
12%
Flag icon
The Master Algorithm is for induction, the process of learning, what the Turing machine is for deduction.
12%
Flag icon
The most determined resistance comes from machine learning’s perennial foe: knowledge engineering. According to its proponents, knowledge can’t be learned automatically; it must be programmed into the computer by human experts. Sure, learners can extract some things from data, but nothing you’d confuse with real knowledge. To knowledge engineers, big data is not the new oil; it’s the new snake oil.
13%
Flag icon
Despite machine learning’s successes, the knowledge engineers remain unconvinced. They believe that its limitations will soon become apparent, and the pendulum will swing back. Marvin Minsky, an MIT professor and AI pioneer, is a prominent member of this camp. Minsky is not just skeptical of machine learning as an alternative to knowledge engineering, he’s skeptical of any unifying ideas in AI. Minsky’s theory of intelligence, as expressed in his book The Society of Mind, could be unkindly characterized as “the mind is just one damn thing after another.” The Society of Mind is a laundry list ...more
13%
Flag icon
Knowledge engineers believe AI is just an engineering problem, but we have not yet reached the point where engineering can take us the rest of the way. In 1962, when Kennedy gave his famous moon-shot speech, going to the moon was an engineering problem. In 1662, it wasn’t, and that’s closer to where AI is today.
13%
Flag icon
Another prominent machine-learning skeptic is the linguist Noam Chomsky. Chomsky believes that language must be innate, because the examples of grammatical sentences children hear are not enough to learn a grammar. This only puts the burden of learning language on evolution, however; it does not argue against the Master Algorithm but only against it being something like the brain. Moreover, if a universal grammar exists (as Chomsky believes), elucidating it is a step toward elucidating the Master Algorithm. The only way this is not the case is if language has nothing in common with other ...more
13%
Flag icon
Statistical language learners work, and hand-engineered language systems don’t.
13%
Flag icon
Another potential source of objections to the Master Algorithm is the notion, popularized by the psychologist Jerry Fodor, that the mind is composed of a set of modules with only limited communication between them.
13%
Flag icon
Critics like Minsky, Chomsky, and Fodor once had the upper hand, but thankfully their influence has waned. Nevertheless, we should keep their criticisms in mind as we set out on the road to the Master Algorithm for two reasons. The first is that knowledge engineers faced many of the same problems machine learners do, and even if they didn’t succeed, they learned many valuable lessons. The second is that learning and knowledge are intertwined in surprisingly subtle ways, as we’ll soon find out. Unfortunately, the two camps often talk past each other. They speak different languages: machine ...more
13%
Flag icon
Outside of AI and cognitive science, the most common objections to machine learning are variants of this claim. Nassim Taleb hammered on it forcefully in his book The Black Swan. Some events are simply not predictable. If you’ve only ever seen white swans, you think the probability of ever seeing a black one is zero.
14%
Flag icon
Learning algorithms are quite capable of accurately predicting rare, never-before-seen events; you could even say that that’s what machine learning is all about. What’s the probability of a black swan if you’ve never seen one?
14%
Flag icon
A related, frequently heard objection is “Data can’t replace human intuition.” In fact, it’s the other way around: human intuition can’t replace data. Intuition is what you use when you don’t know the facts, and since you often don’t, intuition is precious. But when the evidence is before you, why would you deny it?
14%
Flag icon
There’s a saying in industry: “Listen to your customers, not to the HiPPO,” HiPPO being short for “highest paid person’s opinion.” If you want to be tomorrow’s authority, ride the data, don’t fight it.
14%
Flag icon
Science goes through three phases, which we can call the Brahe, Kepler, and Newton phases. In the Brahe phase, we gather lots of data, like Tycho Brahe patiently recording the positions of the planets night after night, year after year. In the Kepler phase, we fit empirical laws to the data, like Kepler did to the planets’ motions. In the Newton phase, we discover the deeper truths. Most science consists of Brahe- and Kepler-like work; Newton moments are rare. Today, big data does the work of billions of Brahes, and machine learning the work of millions of Keplers. If—let’s hope so—there are ...more
14%
Flag icon
In a famous passage of his book The Sciences of the Artificial, AI pioneer and Nobel laureate Herbert Simon asked us to consider an ant laboriously making its way home across a beach. The ant’s path is complex, not because the ant itself is complex but because the environment is full of dunelets to climb and pebbles to get around. If we tried to model the ant by programming in every possible path, we’d be doomed. Similarly, in machine learning the complexity is in the data; all the Master Algorithm has to do is assimilate it, so we shouldn’t be surprised if it turns out to be simple. The human ...more
« Prev 1 3 4