The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Kindle Notes & Highlights
32%
In an early demonstration of the power of backprop, Terry Sejnowski and Charles Rosenberg trained a multilayer perceptron to read aloud. Their NETtalk system scanned the text, selected the correct phonemes according to context, and fed them to a speech synthesizer.
32%
(You can find samples on YouTube by typing “sejnowski nettalk.”)
33%
sparse autoencoder,
33%
The next clever idea is to stack sparse autoencoders on top of each other like a club sandwich. The hidden layer of the first autoencoder becomes the input/output layer of the second one, and so on. Because the neurons are nonlinear, each hidden layer learns a more sophisticated representation of the input, building on the previous one.
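Below is a minimal sketch of that stacking idea in Python, assuming made-up layer sizes, a plain gradient-descent trainer, and no sparsity penalty (the sparsity term that makes these autoencoders "sparse" is omitted for brevity); it only illustrates how one autoencoder's hidden codes become the next one's input.

# A sketch of stacking autoencoders (sizes and hyperparameters are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyAutoencoder:
    """One nonlinear hidden layer, trained to reconstruct its input by gradient descent."""
    def __init__(self, n_in, n_hidden, lr=0.1):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)
        self.lr = lr

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def fit(self, X, epochs=200):
        for _ in range(epochs):
            H = self.encode(X)                 # hidden code
            X_hat = H @ self.W2 + self.b2      # linear reconstruction
            err = X_hat - X                    # reconstruction error
            # Backpropagate the squared-error loss.
            dW2 = H.T @ err / len(X)
            db2 = err.mean(axis=0)
            dH = err @ self.W2.T * H * (1 - H) # through the sigmoid
            dW1 = X.T @ dH / len(X)
            db1 = dH.mean(axis=0)
            self.W1 -= self.lr * dW1; self.b1 -= self.lr * db1
            self.W2 -= self.lr * dW2; self.b2 -= self.lr * db2
        return self

# Stack: the first autoencoder's hidden layer becomes the second one's input.
X = rng.normal(size=(500, 20))
ae1 = TinyAutoencoder(20, 10).fit(X)
codes1 = ae1.encode(X)
ae2 = TinyAutoencoder(10, 5).fit(codes1)
codes2 = ae2.encode(codes1)   # a more abstract representation of the original input
print(codes2.shape)           # (500, 5)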
33%
In his book On Intelligence, Jeff Hawkins advocated designing algorithms closely based on the organization of the cortex, but so far none of these algorithms can compete with today’s deep networks.
35%
capturing natural selection by a set of equations is extremely difficult, but expressing it as an algorithm is another matter, and can shed light on many otherwise vexing questions.
39%
Daniel Kahneman illustrates at length in his book Thinking, Fast and Slow.
40%
P(cause | effect) = P(cause) × P(effect | cause) / P(effect).
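A worked example of this formula, with numbers made up purely for illustration (a rare "cause" and an imperfect observed "effect"):

# Bayes' rule: P(cause | effect) = P(cause) * P(effect | cause) / P(effect)
# Illustrative numbers (assumed for this example): a rare disease and an imperfect test.
p_cause = 0.01               # P(disease)
p_effect_given_cause = 0.9   # P(positive test | disease)
p_effect_given_not = 0.05    # P(positive test | no disease)

# P(effect) by the law of total probability.
p_effect = p_effect_given_cause * p_cause + p_effect_given_not * (1 - p_cause)

p_cause_given_effect = p_cause * p_effect_given_cause / p_effect
print(round(p_cause_given_effect, 3))   # ~0.154: even after a positive test, the cause is still unlikely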
41%
machine learning is the art of making false assumptions and getting away with it.
43%
Everything is connected, but not directly
45%
P(hypothesis | data) = P(hypothesis) × P(data | hypothesis) / P(data)
45%
The hypothesis can be as complex as a whole Bayesian network, or as simple as the probability that a coin will come up heads. In the latter case, the data is just the outcome of a series of coin flips.
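A minimal sketch of that coin example: each hypothesis is a candidate probability of heads, and Bayes' theorem reweights them after seeing the flips. The candidate values, the uniform prior, and the flip sequence below are assumptions for illustration, not from the book.

# P(hypothesis | data) = P(hypothesis) * P(data | hypothesis) / P(data), for coin flips.
import numpy as np

hypotheses = np.array([0.25, 0.5, 0.75])   # candidate values of P(heads)
prior = np.array([1/3, 1/3, 1/3])          # uniform prior over the hypotheses
flips = [1, 1, 0, 1, 1, 1, 0, 1]           # 1 = heads, 0 = tails

# P(data | hypothesis): product of per-flip probabilities under each candidate.
likelihood = np.array([
    np.prod([h if f == 1 else 1 - h for f in flips]) for h in hypotheses
])

posterior = prior * likelihood
posterior /= posterior.sum()               # dividing by P(data) just normalizes

for h, p in zip(hypotheses, posterior):
    print(f"P(heads = {h}) given the flips: {p:.3f}")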
48%
The nearest-neighbor algorithm,
48%
support vector machines,
48%
analogical reasoning,
63%
The metalearner can itself be any learner, from a decision tree to a simple weighted vote. To learn the weights, or the decision tree, we replace the attributes of each original example by the learners’ predictions. Learners that often predict the correct class will get high weights, and inaccurate ones will tend to be ignored. With a decision tree, the choice of whether to use a learner can be contingent on other learners’ predictions. Either way, to obtain a learner’s prediction for a given training example, we must first apply it to the original training set excluding that example and use ...more
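A sketch of that scheme with scikit-learn, assuming a decision tree and a nearest-neighbor classifier as base learners and logistic regression as the metalearner (these particular choices, and the data set, are illustrative): cross_val_predict supplies each example's base-learner predictions from models trained without that example's fold, as the passage requires.

# Stacking: the base learners' out-of-fold predictions become the metalearner's input features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# Replace each training example's attributes by the base learners' predictions,
# computed on folds that exclude that example.
meta_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5) for m in base_learners
])
metalearner = LogisticRegression().fit(meta_train, y_train)

# At prediction time, the base learners are trained on the full training set.
for m in base_learners:
    m.fit(X_train, y_train)
meta_test = np.column_stack([m.predict(X_test) for m in base_learners])
print("stacked accuracy:", metalearner.score(meta_test, y_test))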
63%
This type of metalearning is called stacking and is the brainchild of David Wolpert, whom we met in Chapter 3 as the author of the “no free lunch” theorem. An even simpler metalearner is bagging, invented by the statistician Leo Breiman. Bagging generates random variations of the training set by resampling, applies the same learner to each one, and combines the results by voting. The reason to do this is that it reduces variance: the combined model is much less sensitive to the vagaries of the data than any single one, making this a remarkably easy way to improve accuracy. If the models are ...more
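A minimal sketch of bagging, assuming decision trees as the base learner and 25 bootstrap rounds (both illustrative choices): each model sees a resampled copy of the training set, and the ensemble votes.

# Bagging: resample the training set with replacement, train the same learner on
# each resample, and combine the predictions by majority vote.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)

models = []
for _ in range(25):
    idx = rng.integers(0, len(X_train), size=len(X_train))   # bootstrap resample
    models.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Majority vote across the 25 trees (labels are 0/1, so a mean above 0.5 wins).
votes = np.array([m.predict(X_test) for m in models])
majority = (votes.mean(axis=0) > 0.5).astype(int)
print("bagged accuracy:", (majority == y_test).mean())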
63%
One of the cleverest metalearners is boosting, created by two learning theorists, Yoav Freund and Rob Schapire.
63%
boosting repeatedly applies the same classifier to the data, using each new model to correct the previous ones’ mistakes. It does this by assigning weights to the training examples; the weight of each misclassified example is increased after each round of learning, causing later rounds to focus more on it. The name boosting comes from the notion that this process can boost a classifier that’s only slightly better than random guessing, but consistently so, into one that’s almost perfect. Metalearning is remarkably successful, but it’s not a very deep way to combine models. It’s also expensive, ...more
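A minimal sketch of that weight-update idea in the AdaBoost style: the weight formulas below are the standard AdaBoost ones, and the decision stumps, the number of rounds, and the data set are illustrative assumptions rather than anything specific from the book.

# Boosting: after each round, misclassified examples get more weight, so the next
# model concentrates on them; the final prediction is a weighted vote of all rounds.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
y = 2 * y - 1                                  # relabel classes as -1 / +1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

w = np.full(len(X_train), 1 / len(X_train))    # start with equal example weights
stumps, alphas = [], []

for _ in range(20):
    stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train, sample_weight=w)
    pred = stump.predict(X_train)
    err = w[pred != y_train].sum()             # weighted error of this round
    alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
    w *= np.exp(-alpha * y_train * pred)       # increase the weights of the mistakes
    w /= w.sum()
    stumps.append(stump); alphas.append(alpha)

scores = sum(a * s.predict(X_test) for a, s in zip(alphas, stumps))
print("boosted accuracy:", (np.sign(scores) == y_test).mean())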
65%
Markov logic networks.
66%
can download the learner I’ve just described from alchemy.cs.washington.edu. We christened it Alchemy to remind ourselves that, despite all its successes, machine learning is still in the alchemy stage of science. If you do download it, you’ll see that it includes a lot more than the basic algorithm I’ve described but also that it is still missing a few things I said the universal learner ought to have, like crossover. Nevertheless, let’s use the name Alchemy to refer to our candidate universal learner for simplicity.
68%
semantic network
68%
Europe’s FuturICT project aims to build a model of—literally—the whole world.
72%
Companies like Acxiom collate and sell information about you, but if you inspect it (which in Acxiom’s case you can, at aboutthedata.com),
74%
the greatest benefit of machine learning may ultimately be not what the machines learn but what we learn by teaching them.
75%
The first big worry, as with any technology, is that AI could fall into the wrong hands.
75%
The second worry is that humans will voluntarily surrender control.
75%
Your job in a world of intelligent machines is to keep making sure they do what you want, both at the input (setting the goals) and at the output (checking that you got what you asked for). If you don’t, somebody else will. Machines can help us figure out collectively what we want, but if you don’t participate, you lose out—
75%
The third and perhaps biggest worry is that, like the proverbial genie, the machines will give us what we ask for instead of what we want.
76%
People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.
76%
age of affordable beauty, in William Gibson’s memorable words.
77%
five tribes of machine learning and their master algorithms: symbolists and inverse deduction; connectionists and backpropagation; evolutionaries and genetic algorithms; Bayesians and probabilistic inference; analogizers and support vector machines.
77%
UCI repository (archive.ics.uci.edu/ml/) and start playing. When you’re ready, check out Kaggle.com, a whole website dedicated to running machine-learning competitions, and pick one or two to enter.
Tim Moore
link good as of 2021 Jan 3
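As one way to "start playing," here is a minimal sketch that trains a learner on the Iris data, a classic from the UCI repository that also ships with scikit-learn; the decision tree and the default train/test split are my choices, not the author's.

# A first experiment on a classic UCI data set (Iris, bundled with scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))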
78%
(http://www.cs.washington.edu/homes/pedrod/class). Two other options are Andrew Ng’s course (www.coursera.org/course/ml) and Yaser Abu-Mostafa’s (http://work.caltech.edu/telecourse.html).
78%
Kevin Murphy’s Machine Learning: A Probabilistic Perspective* (MIT Press, 2012), Chris Bishop’s Pattern Recognition and Machine Learning* (Springer, 2006), and An Introduction to Statistical Learning with Applications in R,* by Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani (Springer, 2013).
79%
Weka (www.cs.waikato.ac.nz/ml/weka). The two main machine-learning journals are Machine Learning and the Journal of Machine Learning Research. Leading machine-learning conferences, with yearly proceedings, include the International Conference on Machine Learning, the Conference on Neural Information Processing Systems, and the International Conference on Knowledge Discovery and Data Mining. A large number of machine-learning talks are available on http://videolectures.net. The www.KDnuggets.com website is a one-stop shop for machine-learning resources, and you can sign up for its newsletter to ...more