Kindle Notes & Highlights
Read between
April 23 - September 15, 2018
In other words, the ability to resist temptation may be, at least in part, a matter of expectations rather than willpower.
Failing the marshmallow test—and being less successful in later life—may not be about lacking willpower. It could be a result of believing that adults are not dependable: that they can’t be trusted to keep their word, that they disappear for intervals of arbitrary length. Learning self-control is important, but it’s equally important to grow up in an environment where adults are consistently present and trustworthy.
Even when we accumulate biases that aren’t objectively correct, they still usually do a reasonable job of reflecting the specific part of the world we live in.
Everything starts to break down, however, when a species gains language. What we talk about isn’t what we experience—we speak chiefly of interesting things, and those tend to be things that are uncommon.
it skews the statistics of our experience. That makes it hard to maintain appropriate prior distributions.
The murder rate in the United States declined by 20% over the course of the 1990s, yet during that time period the presence of gun violence on American news increased by 600%.
if you want to naturally make good predictions, without having to think about what kind of prediction rule is appropriate—you need to protect your priors. Counterintuitively, that might mean turning off the news.
When we think about thinking, it’s easy to assume that more is better: that you will make a better decision the more pros and cons you list, make a better prediction about the price of a stock the more relevant factors you identify, and write a better report the more time you spend working on it.
The question of how hard to think, and how many factors to consider, is at the heart of a knotty problem that statisticians and machine-learning researchers call “overfitting.”
And dealing with that problem reveals that there’s a wisdom to deliberately thinking less. Being aware of overfitting changes how we should approach the market, the dining table, the gym … and the altar.
So one of the deepest truths of machine learning is that, in fact, it’s not always better to use a more complex model, one that takes a greater number of factors into account. And the issue is not just that the extra factors might offer diminishing returns—performing better than a simpler model, but not enough to justify the added complexity. Rather, they might make our predictions dramatically worse.
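A minimal sketch of that effect, under made-up assumptions: the true relationship is y = 2x, every observation is off by +1 or -1, and the names `fit_line` and `fit_interpolant` are illustrative only. The simple model is a straight line; the complex one is a polynomial that passes through every observation exactly.

```python
# Illustrative sketch: the data, the noise pattern, and all names are assumptions.

def fit_line(xs, ys):
    """Least-squares straight line: the simple, two-parameter model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point: one parameter per observation."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Observations: y = 2x plus an alternating error of +1 / -1 (assumed for the demo).
train_x = [float(i) for i in range(8)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]
# Held-out points halfway between the observations, with no error added.
test_x = [x + 0.5 for x in train_x[:-1]]
test_y = [2 * x for x in test_x]

simple = fit_line(train_x, train_y)
complex_model = fit_interpolant(train_x, train_y)

for name, model in (("line", simple), ("interpolant", complex_model)):
    print(name,
          "| error on seen data:", round(mean_squared_error(model, train_x, train_y), 3),
          "| error on unseen data:", round(mean_squared_error(model, test_x, test_y), 3))
```

The interpolant matches the data it has seen perfectly and does far worse than the line on the points it has not seen, which is the pattern the passage describes.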
If you can’t explain it simply, you don’t understand it well enough.
Techniques like the Lasso are now ubiquitous in machine learning, but the same kind of principle—a penalty for complexity—also appears in nature.
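As a rough illustration of a penalty for complexity: in the simplest textbook setting (inputs that do not overlap with one another), the Lasso's effect on each factor's weight reduces to "soft thresholding", which shrinks every weight toward zero and drops the small ones entirely. The factor names, weights, and penalty below are invented for the example.

```python
def soft_threshold(coefficient, penalty):
    """Shrink a weight toward zero by `penalty`; weights smaller than the penalty vanish."""
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0


# Hypothetical weights a model might assign to four factors.
weights = {"price": 3.0, "weather": -0.4, "day_of_week": 0.1, "season": -2.5}
penalized = {name: soft_threshold(w, 0.5) for name, w in weights.items()}
kept = [name for name, w in penalized.items() if w != 0.0]

print(penalized)
print("factors that survive the penalty:", kept)
```

Only the factors with a large enough effect stay in the model; the rest are priced out.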
Language forms yet another natural Lasso: complexity is punished by the labor of speaking at greater length and the taxing of our listener’s attention span. Business plans get compressed to an elevator pitch; life advice becomes proverbial wisdom only if it is sufficiently concise and catchy. And anything that needs to be remembered has to pass through the inherent Lasso of memory.
“In contrast to the widely held view that less processing reduces accuracy,” they write, “the study of heuristics shows that less information, computation, and time can in fact improve accuracy.” A heuristic that favors simpler answers—with fewer factors, or less computation—offers precisely these “less is more” effects.
This kind of setup—where more time means more complexity—characterizes a lot of human endeavors. Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision.
But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting.
nitty-gritty details
If you have high uncertainty and limited data, then do stop early by all means. If you don’t have a clear read on how your work will be evaluated, and by whom, then it’s not worth the extra time to make it perfect with respect to your own (or anyone else’s) idiosyncratic guess at what perfection might be. The greater the uncertainty, the bigger the gap between what you can measure and what matters, the more you should watch out for overfitting—that is, the more you should prefer simplicity, and the earlier you should stop.
When you’re truly in the dark, the best-laid plans will be the simplest.
When we start designing something, we sketch out ideas with a big, thick Sharpie marker, instead of a ball-point pen. Why? Pen points are too fine. They’re too high-resolution. They encourage you to worry about things that you shouldn’t worry about yet, like perfecting the shading or whether to use a dotted or dashed line. You end up focusing on things that should still be out of focus. A Sharpie makes it impossible to drill down that deep. You can only draw shapes, lines, and boxes. That’s good. The big picture is all you should be worrying about in the beginning.
“Never mind, trust to chance”—
the molecules’ so-called nearest-neighbor interactions became—well—nearest-neighbor interactions.
She also codified the program’s goal: to maximize the relationship scores between the guests and their tablemates.
sorting a deck of cards by throwing them in the air until they happen to land in order.
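The card-throwing procedure can be sketched in a few lines. This is a toy version with a five-card "deck" and a seeded random generator so the run is repeatable; the function name `throw_until_sorted` is made up for the example.

```python
import random


def throw_until_sorted(cards, rng):
    """Shuffle at random until the cards happen to be in order; count the throws."""
    cards = list(cards)  # work on a copy so the caller's deck is untouched
    throws = 0
    while any(a > b for a, b in zip(cards, cards[1:])):
        rng.shuffle(cards)
        throws += 1
    return cards, throws


deck = [3, 1, 4, 5, 2]
ordered, throws = throw_until_sorted(deck, random.Random(0))
print(ordered, "after", throws, "throws")
```

With five cards there are only 120 possible orders, so this finishes quickly. A full 52-card deck has so many orders that the same loop would effectively never end.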
how to best approach problems whose optimal answers are out of reach. How to relax.
Just Relax The perfect is the enemy of the good. —VOLTAIRE
Finding the shortest route under these looser rules produces what’s called the “minimum spanning tree.”
(If you prefer, you can also think of the minimum spanning tree as the fewest miles of road needed to connect every town to at least one other town.)
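One standard way to find a minimum spanning tree is Kruskal's algorithm: repeatedly take the shortest remaining road that links two groups of towns not yet connected. The sketch below uses that algorithm as an assumption (the highlight does not name a method), and the towns and mileages are invented.

```python
def minimum_spanning_tree(towns, roads):
    """Kruskal's algorithm over (miles, town_a, town_b) roads."""
    group = {town: town for town in towns}  # each town starts in its own group

    def find(town):
        # Follow links until reaching the representative of the town's group.
        while group[town] != town:
            town = group[town]
        return town

    chosen = []
    for miles, a, b in sorted(roads):  # shortest roads first
        root_a, root_b = find(a), find(b)
        if root_a != root_b:           # the road joins two separate groups
            group[root_a] = root_b
            chosen.append((miles, a, b))
    return chosen


towns = ["A", "B", "C", "D"]
roads = [(4, "A", "B"), (1, "A", "C"), (2, "B", "C"), (5, "B", "D"), (8, "C", "D")]
tree = minimum_spanning_tree(towns, roads)
print(tree, "total miles:", sum(miles for miles, _, _ in tree))
```

Sorting the roads once and making a single pass over them is all the work involved.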
As it turns out, solving this looser problem takes a computer essentially no time at all.
If you can’t solve the problem in front of you, solve an easier version of it—and then see if that solution offers you a starting point, or a beacon, in the full-blown problem. Maybe it does.
In cities, for example, planners try to place fire trucks so that every house can be reached within a fixed amount of time—say, five minutes. Mathematically,
but it’s exactly the kind of problem that political campaign managers and corporate marketers want to solve to spread their message most effectively.
is to ask how good this solution is compared to the actual best solution we might have come up with by exhaustively checking every single possible answer to the original problem.
Lagrangian Relaxations are a huge part of the theoretical literature
Generally, when people first come to us with a sports schedule, they will claim … “We never do x and we never do y.” Then we look at their schedules and we say, “Well, twice you did x and three times you did y last year.” Then “Oh, yeah, well, okay. Then other than that we never do it.” And then we go back the year before.… We generally realize that there are some things they think they never do that people do do. People in baseball believe that the Yankees and the Mets are never at home at the same time. And it’s not true. It’s never been true. They are at home perhaps three games, perhaps
As Trick says, rather than spending eons searching for an unattainable perfect answer, using Lagrangian Relaxation allows him to ask questions like,
“the test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function.”
In scheduling theory, as we saw in chapter 5, a greedy algorithm—for instance, always doing the shortest job available, without looking or planning beyond—can sometimes be all that a problem requires.
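A small sketch of the greedy rule, assuming the goal is to minimize the sum of the times at which each job finishes (the goal is an assumption here; the highlight does not state it). The job lengths are invented, and the brute-force check over every ordering is only there to confirm the greedy answer on this tiny example.

```python
from itertools import permutations


def total_completion_time(durations):
    """Sum of the finishing times when jobs run back to back in the given order."""
    elapsed = 0
    total = 0
    for duration in durations:
        elapsed += duration
        total += elapsed
    return total


def shortest_job_first(durations):
    """Greedy rule: always do the shortest job available, with no look-ahead."""
    return sorted(durations)


jobs = [4, 1, 3]
greedy_order = shortest_job_first(jobs)
best_by_brute_force = min(total_completion_time(order) for order in permutations(jobs))

print("as given:", total_completion_time(jobs))
print("shortest first:", greedy_order, total_completion_time(greedy_order))
print("best of all orderings:", best_by_brute_force)
```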
Hill Climbing—since the search through a space of solutions, some better and some worse, is commonly thought of in terms of a landscape with hills and valleys, where your goal is to reach the highest peak.
You can know that you’re standing on a mountaintop because the ground falls away in all directions—but there might be a higher mountain just across the next valley, hidden behind clouds.
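The hidden-mountain problem shows up even in a one-dimensional landscape. In this sketch the list of heights is invented, and `hill_climb` simply steps to the higher neighbor until neither neighbor is higher.

```python
def hill_climb(heights, start):
    """Move to the higher neighbor until the ground falls away on both sides."""
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[position]:
            return position  # a peak, though not necessarily the highest one
        position = best


# A landscape with a small peak (height 5) and a taller one (height 9).
heights = [1, 3, 5, 4, 2, 6, 9, 7]
print("starting at 0 ends at", hill_climb(heights, 0), "height", heights[hill_climb(heights, 0)])
print("starting at 4 ends at", hill_climb(heights, 4), "height", heights[hill_climb(heights, 4)])
```

Starting on the left, the climber stops on the smaller peak and never learns that a taller one sits just across the valley.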
But there’s also a third approach: instead of turning to full-bore randomness when you’re stuck, use a little bit of randomness every time you make a decision. This technique, developed by the same Los Alamos team that came up with the Monte Carlo Method, is called the Metropolis Algorithm.
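A hedged sketch of the idea on a toy landscape: improvements are always accepted, and a step downhill is accepted with a probability that shrinks the further down it goes. The exponential acceptance rule is the usual textbook form and is assumed here; the heights, the temperature of 2.0, and the function names are all illustrative.

```python
import math
import random


def acceptance_probability(current, candidate, temperature):
    """1.0 for an improvement; exp(-(drop) / temperature) for a worse candidate."""
    if candidate >= current:
        return 1.0
    return math.exp((candidate - current) / temperature)


def metropolis_step(heights, position, temperature, rng):
    """Propose a random neighbor and move there with the acceptance probability."""
    neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
    candidate = rng.choice(neighbors)
    p = acceptance_probability(heights[position], heights[candidate], temperature)
    return candidate if rng.random() < p else position


heights = [1, 3, 5, 4, 2, 6, 9, 7]
rng = random.Random(0)
position = 2            # start on the smaller peak
visited = [position]
for _ in range(200):
    position = metropolis_step(heights, position, 2.0, rng)
    visited.append(position)

print("highest point visited:", max(heights[i] for i in visited))
```

Because a little randomness is applied at every step, the walker can leave a small peak without ever resorting to a completely random restart.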
Wikipedia, for instance, offers a “Random article” link, and Tom has been using it as his browser’s default homepage for several years, seeing a randomly selected Wikipedia entry each time he opens a new window. While this hasn’t yet resulted in any striking discoveries, he now knows a lot about some obscure topics (such as the kind of knife used by the Chilean armed forces) and he feels that some of these have enriched his life. (For example, he’s learned that there is a word in Portuguese for a “vague and constant desire for something that does not and probably cannot exist,” a problem we
The cult classic 1971 novel The Dice Man by Luke Rhinehart (real name: George Cockcroft) provides a cautionary tale.
But perhaps it’s just a case of a little knowledge being a dangerous thing. If the Dice Man had only had a deeper grasp of computer science, he’d have had some guidance. First, from Hill Climbing: even if you’re in the habit of sometimes acting on bad ideas, you should always act on good ones. Second, from the Metropolis Algorithm: your likelihood of following a bad idea should be inversely proportional to how bad an idea it is. Third, from Simulated Annealing: you should front-load randomness, rapidly cooling out of a totally random state, using ever less and less randomness as time goes on,
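The third piece of advice, front-loading randomness and then cooling, can be sketched by lowering the temperature a little on every step. The starting temperature, cooling rate, step count, and landscape below are arbitrary choices for illustration, and returning the best position seen along the way is a convenience added for this example.

```python
import math
import random


def simulated_annealing(heights, start, rng,
                        start_temperature=10.0, cooling=0.95, steps=300):
    """Random walk that accepts downhill moves often at first and rarely later."""
    position = best = start
    temperature = start_temperature
    for _ in range(steps):
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
        candidate = rng.choice(neighbors)
        change = heights[candidate] - heights[position]
        # Uphill moves are always taken; downhill moves depend on the temperature.
        if change >= 0 or rng.random() < math.exp(change / temperature):
            position = candidate
        if heights[position] > heights[best]:
            best = position
        temperature *= cooling  # use less and less randomness as time goes on
    return best


heights = [1, 3, 5, 4, 2, 6, 9, 7]
best = simulated_annealing(heights, start=2, rng=random.Random(0))
print("best position found:", best, "height", heights[best])
```

Early on the walk wanders almost freely; by the end it behaves like plain hill climbing.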
In packet switching, on the other hand, the proliferation of paths in a growing network becomes a virtue: there are now that many more ways for data to flow, so the reliability of the network increases exponentially with its size.