Algorithms to Live By: The Computer Science of Human Decisions
Read between September 27 and November 20, 2024
1%
We know this because finding an apartment belongs to a class of mathematical problems known as “optimal stopping” problems. The 37% rule defines a simple series of steps—what computer scientists call an “algorithm”—for solving these problems. And as it turns out, apartment hunting is just one of the ways that optimal stopping rears its head in daily life.
1%
As Carl Sagan put it, “Science is a way of thinking much more than it is a body of knowledge.” Even in cases where life is too messy for us to expect a strict numerical analysis or a ready answer, using intuitions and concepts honed on the simpler forms of these problems offers us a way to understand the key issues and make progress.
1%
These hard-won precepts are at odds with our intuitions about rationality, and they don’t sound anything like the narrow prescriptions of a mathematician trying to force the world into clean, formal lines. They say: Don’t always consider all your options. Don’t necessarily go for the outcome that seems best every time. Make a mess on occasion. Travel light. Let things wait. Trust your instincts and don’t think too long. Relax. Toss a coin. Forgive, but don’t forget. To thine own self be true.
3%
Instead, the optimal solution takes the form of what we’ll call the Look-Then-Leap Rule: You set a predetermined amount of time for “looking”—that is, exploring your options, gathering data—in which you categorically don’t choose anyone, no matter how impressive. After that point, you enter the “leap” phase, prepared to instantly commit to anyone who outshines the best applicant you saw in the look phase.
3%
As it turns out, following this optimal strategy ultimately gives us a 37% chance of hiring the best applicant; it’s one of the problem’s curious mathematical symmetries that the strategy itself and its chance of success work out to the very same number.
3%
Yet remarkably, the math of the secretary problem doesn’t change. If you’re stopping optimally, your chance of finding the single best applicant in a pool of a hundred is 37%. And in a pool of a million, believe it or not, your chance is still 37%.
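A quick Monte Carlo sketch of the Look-Then-Leap Rule to check these numbers (my own illustration, not the authors' code): shuffle the applicants, look at the first 1/e of them without committing, then leap at the first one who beats everything from the look phase. The success rate should sit near 37% whether the pool is small or large.

```python
import math
import random

def look_then_leap(n, rng):
    """One run of the secretary problem with the 37% (1/e) cutoff.
    Rank 0 is the best applicant; return True if we end up hiring them."""
    order = list(range(n))
    rng.shuffle(order)
    cutoff = int(n / math.e)                 # "look" phase: observe only
    best_seen = min(order[:cutoff]) if cutoff else n
    for applicant in order[cutoff:]:
        if applicant < best_seen:            # "leap" at the first one who beats the look phase
            return applicant == 0
    return order[-1] == 0                    # nobody qualified: stuck with the last applicant

rng = random.Random(42)
trials = 5_000
for n in (100, 1_000):
    wins = sum(look_then_leap(n, rng) for _ in range(trials))
    print(f"pool of {n:>5}: hired the best in {wins / trials:.1%} of {trials} runs")
```

Both pool sizes should come out close to 37%, the curious symmetry described above.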
4%
Full information means that we don’t need to look before we leap. We can instead use the Threshold Rule, where we immediately accept an applicant if they are above a certain percentile. We don’t need to look at an initial group of candidates to set this threshold—but we do, however, need to be keenly aware of how much looking remains available.
4%
The chance of ending up with the single best applicant in this full-information version of the secretary problem comes to 58%—still far from a guarantee, but considerably better than the 37% success rate offered by the 37% Rule in the no-information game. If you have all the facts, you can succeed more often than not, even as the applicant pool grows arbitrarily large.
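For the curious, here is a rough numerical sketch of where that 58% comes from, under the standard full-information setup (applicant scores drawn independently from Uniform(0,1), and you win only by stopping on the single best). The function name, grid size, and discretization are mine; the dynamic program tracks the best score seen so far and how many applicants remain.

```python
import numpy as np

def full_info_best_prob(n, grid=4001):
    """Optimal probability of stopping on the overall best of n i.i.d.
    Uniform(0,1) scores seen one at a time (the Gilbert-Mosteller setup).
    W[m] is the win probability with k applicants left and running maximum m."""
    x = np.linspace(0.0, 1.0, grid)
    dx = x[1] - x[0]
    W = np.zeros(grid)                       # k = 0: no one left, we lose
    for k in range(1, n + 1):
        f = np.maximum(x ** (k - 1), W)      # stop on a new maximum vs. keep going
        seg = 0.5 * (f[1:] + f[:-1]) * dx    # trapezoidal pieces of the tail integral
        tail = np.concatenate(([0.0], np.cumsum(seg[::-1])))[::-1]
        W = x * W + tail                     # skip non-maxima, integrate over new maxima
    return W[0]

for n in (10, 100, 1000):
    print(n, round(full_info_best_prob(n), 4))   # settles toward roughly 0.58 as n grows
```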
5%
The critical thing to note in this problem is that our threshold depends only on the cost of search. Since the chances of the next offer being a good one—and the cost of finding out—never change, our stopping price has no reason to ever get lower as the search goes on, regardless of our luck. We set it once, before we even begin, and then we quite simply hold fast.
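One simple model that makes the fixed threshold concrete (my own simplification, not necessarily the book's exact numbers): offers arrive one at a time, uniformly distributed over the possible range, and each further offer costs a fixed fraction c of that range to wait for. The threshold sits where the expected improvement from one more offer exactly equals c, and since neither quantity changes as you search, neither does the threshold.

```python
from math import sqrt

def stopping_threshold(cost):
    """Offers ~ Uniform(0, 1); waiting for one more offer costs `cost`.
    Accept the first offer above t, where the expected gain from another
    draw, (1 - t)**2 / 2, just equals the cost of drawing it."""
    return 1 - sqrt(2 * cost)

for cost in (0.001, 0.01, 0.05):
    print(f"search cost {cost:.3f} of the range -> hold out for at least {stopping_threshold(cost):.2f}")
```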
5%
Shoup’s solution involves installing digital parking meters that are capable of adaptive prices that rise with demand. (This has now been implemented in downtown San Francisco.)
5%
The prices are set with a target occupancy rate in mind, and Shoup argues that this rate should be somewhere around 85%—a radical drop from the nearly 100%-packed curbs of most major cities. As he notes, when occupancy goes from 90% to 95%, it accommodates only 5% more cars but doubles the length of everyone’s search.
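One back-of-the-envelope way to see the doubling claim (my own arithmetic, assuming each curb spot is independently occupied with probability equal to the occupancy rate): the expected number of spots you have to check before finding a free one is 1 / (1 - occupancy).

```python
# Expected number of curb spots checked before finding a free one, assuming
# each spot is occupied independently with probability p: roughly 1 / (1 - p).
for p in (0.85, 0.90, 0.95):
    print(f"{p:.0%} occupancy -> expect to check about {1 / (1 - p):.0f} spots")
```

Going from 90% to 95% occupancy takes the expected search from 10 spots to 20, the doubling Shoup describes.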
6%
…we have discussed, but there are sequential decision-making problems for which there is no optimal stopping rule. A simple example is the game of “triple or nothing.” Imagine you have $1.00, and can play the following game as many times as you want: bet all your money, and have a 50% chance of receiving triple the amount and a 50% chance of losing your entire stake. How many times should you play? Despite its simplicity, there is no optimal stopping rule for this problem, since each time you play, your average gains are a little higher.
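A tiny sketch of why no stopping rule works here (my own illustration): the expected bankroll grows by half with every play, since 0.5 × 3 + 0.5 × 0 = 1.5, while the chance of still holding any money at all halves each time.

```python
expected, still_solvent = 1.0, 1.0
for play in range(1, 11):
    expected *= 1.5          # 0.5 chance of tripling, 0.5 chance of zero
    still_solvent *= 0.5     # you only have money left if every bet so far won
    print(f"after {play:2d} plays: expected ${expected:7.2f}, "
          f"chance of not being broke {still_solvent:.4f}")
```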
6%
But this doesn’t make optimal stopping problems less important; it actually makes them more important, because the flow of time turns all decision-making into optimal stopping.
6%
Viewed this way, the secretary problem’s most fundamental yet most unbelievable assumption—its strict seriality, its inexorable one-way march—is revealed to be the nature of time itself. As such, the explicit premise of the optimal stopping problem is the implicit premise of what it is to be alive.
7%
People tend to treat decisions in isolation, to focus on finding each time the outcome with the highest expected value. But decisions are almost never isolated, and expected value isn’t the end of the story. If you’re thinking not just about the next decision, but about all the decisions you are going to make about the same options in the future, the explore/exploit tradeoff is crucial to the process.
7%
A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle. Discovering an enchanting café on your last night in town doesn’t give you the opportunity to return.
7%
So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.
8%
Robbins specifically considered the case where there are exactly two slot machines, and proposed a solution called the Win-Stay, Lose-Shift algorithm: choose an arm at random, and keep pulling it as long as it keeps paying off. If the arm doesn’t pay off after a particular pull, then switch to the other one.
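A minimal sketch of Robbins's Win-Stay, Lose-Shift rule for two Bernoulli slot machines (the payout probabilities below are invented for the demo):

```python
import random

def win_stay_lose_shift(payouts, pulls=10_000, seed=0):
    """Keep pulling the current arm after a win; switch arms after a loss."""
    rng = random.Random(seed)
    arm = rng.randrange(2)                   # start on a random arm
    total, counts = 0, [0, 0]
    for _ in range(pulls):
        counts[arm] += 1
        win = rng.random() < payouts[arm]
        total += win
        if not win:
            arm = 1 - arm                    # lose -> shift
    return total / pulls, counts

print(win_stay_lose_shift([0.4, 0.6]))       # more pulls end up on the 0.6 arm
```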
8%
As so often happens in mathematics, though, the particular is the gateway to the universal.
8%
For every slot machine we know little or nothing about, there is some guaranteed payout rate which, if offered to us in lieu of that machine, will make us quite content never to pull its handle again. This number—which Gittins called the “dynamic allocation index,” and which the world now knows as the Gittins index—suggests an obvious strategy on the…
9%
The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring.
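For anyone who wants to see the index in action, here is a rough numerical sketch of one way to compute it for a Bernoulli arm with a Beta posterior, under geometric discounting at 0.9 (the discount the book's tables use): binary-search for the guaranteed payout that leaves us exactly indifferent between the sure thing and the unknown arm. The truncation horizon, function names, and boundary approximation are all mine, so treat the output as approximate.

```python
from functools import lru_cache

GAMMA = 0.9        # geometric discount factor
HORIZON = 300      # truncate the Beta(a, b) state space at a + b = HORIZON

def value_vs_sure_thing(a, b, p):
    """Value of a Bernoulli arm with Beta(a, b) posterior when, before any
    pull, we may instead retire forever onto a sure payout of p per turn."""
    @lru_cache(maxsize=None)
    def V(a, b):
        retire = p / (1 - GAMMA)
        if a + b >= HORIZON:
            # far down the tree the posterior is sharp; approximate by either
            # retiring or pulling forever at the posterior mean
            return max(retire, (a / (a + b)) / (1 - GAMMA))
        mu = a / (a + b)
        pull = mu * (1 + GAMMA * V(a + 1, b)) + (1 - mu) * GAMMA * V(a, b + 1)
        return max(retire, pull)
    return V(a, b)

def gittins_index(a, b, tol=1e-4):
    """The sure payout that makes us exactly indifferent to the unknown arm."""
    lo, hi = a / (a + b), 1.0               # the index is at least the posterior mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if value_vs_sure_thing(a, b, mid) > mid / (1 - GAMMA) + 1e-9:
            lo = mid                        # the arm still beats this sure thing
        else:
            hi = mid
    return (lo + hi) / 2

print(round(gittins_index(1, 1), 3))   # a never-pulled arm: about 0.70 at discount 0.9
```

For a never-pulled arm this lands near 0.70, comfortably above its 0.50 expected payout, which is exactly the boost to the unknown that the quote describes.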
9%
The untested rookie is worth more (early in the season, anyway) than the veteran of seemingly equal ability, precisely because we know less about him. Exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.
9%
For one, the Gittins index is optimal only under some strong assumptions. It’s based on geometric discounting of future reward, valuing each pull at a constant fraction of the previous one, which is something that a variety of experiments in behavioral economics and psychology suggest people don’t do. And if there’s a cost to switching among…
9%
Logarithmically increasing regret means that we’ll make as many mistakes in our first ten pulls as in the following ninety, and as many in our first year as in the rest of the decade combined. (The first decade’s mistakes, in turn, are as many as we’ll make for the rest of the century.)
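The arithmetic behind that claim, assuming the total number of mistakes grows in proportion to the logarithm of the number of pulls: intervals with the same ratio between their endpoints contain the same number of mistakes.

```python
import math

# If mistakes up to pull t grow like c * ln(t), the mistakes inside an interval
# depend only on its ratio: ln(b) - ln(a) = ln(b / a).
for a, b in [(1, 10), (10, 100), (100, 1_000)]:
    print(f"pulls {a:>4} to {b:<5}: proportional to ln({b}/{a}) = {math.log(b / a):.3f}")
```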
9%
In a multi-armed bandit problem, an Upper Confidence Bound algorithm says, quite simply, to pick the option for which the top of the confidence interval is highest.
9%
Upper Confidence Bound algorithms implement a principle that has been dubbed “optimism in the face of uncertainty.” Optimism, they show, can be perfectly rational. By focusing on the best that an option could be, given the evidence obtained so far, these algorithms give a boost to possibilities we know less about. As a consequence, they naturally inject a dose of exploration into the decision-making process, leaping at new options with enthusiasm because any one of them could be the next big thing.
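A minimal sketch of one such algorithm, the UCB1 variant (the arm probabilities are invented for the demo): each arm is scored by its average payout so far plus a bonus that shrinks the more that arm has been tried, and we always pull the arm with the highest score.

```python
import math
import random

def ucb1(payouts, pulls=10_000, seed=0):
    """Pick the arm whose upper confidence bound (mean + exploration bonus)
    is highest; return how many times each arm was pulled."""
    rng = random.Random(seed)
    k = len(payouts)
    counts, sums = [0] * k, [0.0] * k
    for t in range(1, pulls + 1):
        if t <= k:                            # pull every arm once to start
            arm = t - 1
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < payouts[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

print(ucb1([0.3, 0.5, 0.6]))                  # pulls concentrate on the 0.6 arm
```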
11%
The widespread difficulty with accepting results from adaptive clinical trials might seem incomprehensible. But consider that part of what the advent of statistics did for medicine, at the start of the twentieth century, was to transform it from a field in which doctors had to persuade each other in ad hoc ways about every new treatment into one where they had clear guidelines about what sorts of evidence were and were not persuasive.
12%
To live in a restless world requires a certain restlessness in oneself. So long as things continue to change, you must never fully cease exploring.
Balint Erdi
This is in the context of the "explore-exploit" dilemma.
12%
Still, the algorithmic techniques honed for the standard version of the multi-armed bandit problem are useful even in a restless world. Strategies like the Gittins index and Upper Confidence Bound provide reasonably good approximate solutions and rules of thumb, particularly if payoffs don’t change very much over time.
12%
When we talk about decision-making, we usually focus just on the immediate payoff of a single decision—and if you treat every decision as if it were your last, then indeed only exploitation makes sense. But over a lifetime, you’re going to make a lot of decisions. And it’s actually rational to emphasize exploration—the new rather than the best, the exciting rather than the safe, the random rather than the considered—for many of those choices, particularly earlier in life.
12%
We think of the young as stereotypically fickle; the old, stereotypically set in their ways. In fact, both are behaving completely appropriately with respect to their intervals. The deliberate honing of a social network down to the most meaningful relationships is the rational response to having less time to enjoy them.
12%
Perhaps the deepest insight that comes from thinking about later life as a chance to exploit knowledge acquired over decades is this: life should get better over time. What an explorer trades off for knowledge is pleasure. The Gittins index and the Upper Confidence Bound, as we’ve seen, inflate the appeal of lesser-known options beyond what we actually expect, since pleasant surprises can pay off many times over. But at the same time, this means that exploration necessarily leads to being let down on most occasions.
13%
Shifting the bulk of one’s attention to one’s favorite things should increase quality of life. And it seems like it does: Carstensen has found that older people are generally more satisfied with their social networks, and often report levels of emotional well-being that are higher than those of younger adults.
13%
But with sorting, size is a recipe for disaster: perversely, as a sort grows larger, “the unit cost of sorting, instead of falling, rises.” Sorting involves steep diseconomies of scale, violating our normal intuitions about the virtues of doing things in bulk.
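A quick check of the rising unit cost (the counting wrapper is mine): count how many comparisons Python's built-in sort makes per item as the list grows.

```python
import random
from functools import cmp_to_key

def comparisons_per_item(n, seed=0):
    """Sort n random numbers and report comparisons made divided by n."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    count = 0
    def cmp(a, b):
        nonlocal count
        count += 1
        return (a > b) - (a < b)
    sorted(data, key=cmp_to_key(cmp))
    return count / n

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} items: about {comparisons_per_item(n):.1f} comparisons per item")
```

The per-item cost keeps climbing as the list grows, the opposite of a bulk discount.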
16%
And one of the most central tradeoffs is between sorting and searching. The basic principle is this: the effort expended on sorting materials is just a preemptive strike against the effort it’ll take to search through them later.
16%
Sorting something that you will never search is a complete waste; searching something you never sorted is merely inefficient.
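A rough way to put numbers on that tradeoff (the model and figures are my own simplification): linear-scanning an unsorted pile costs about n/2 looks per search, while sorting once costs about n·log2(n) comparisons and turns every later lookup into a log2(n) binary search.

```python
import math

def break_even_searches(n):
    """Number of future searches at which sorting first starts to pay off,
    in this simplified comparison-counting model (not a benchmark)."""
    sort_cost = n * math.log2(n)
    per_search_sorted, per_search_unsorted = math.log2(n), n / 2
    return sort_cost / (per_search_unsorted - per_search_sorted)

for n in (1_000, 1_000_000):
    print(f"{n:>9} items: sorting pays off after about {break_even_searches(n):.0f} searches")
```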
16%
Dodgson’s complaint was directed at the structure of the Single Elimination tournament, where players are paired off with one another and eliminated from competition as soon as they lose a single match. As Dodgson forcefully argued, the true second-best player could be any of the players eliminated by the best—not just the last-eliminated one.
16%
“As a mathematical fact,” he continued, “the chance that the 2nd best Player will get the prize he deserves is only 16/31sts; while the chance that the best 4 shall get their proper prizes is so small, that the odds are 12 to 1 against its happening!”
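A simulation to check Dodgson's 16/31, under the simplifying assumption that the better player always wins each match and the bracket is drawn at random:

```python
import random

def runner_up(n_players, rng):
    """Single-elimination bracket where a lower number means a strictly better
    player who always wins; return the player who loses the final."""
    players = list(range(n_players))
    rng.shuffle(players)
    while len(players) > 2:
        players = [min(players[i], players[i + 1]) for i in range(0, len(players), 2)]
    return max(players)

rng = random.Random(1)
trials = 100_000
hits = sum(runner_up(32, rng) == 1 for _ in range(trials))
print(f"second best takes second prize in {hits / trials:.3f} of brackets (16/31 = {16 / 31:.3f})")
```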
17%
“A 3:2 score gives the winning team only a 5-in-8 chance of actually being a better team … Personally, I don’t find this to be very impressive. Even a 6:1 blowout leaves a 7% chance that it was a statistical fluke.”
17%
the research of Ackley and his collaborators suggests that there may be a place for algorithms like Bubble Sort after all. Its very inefficiency—moving items only one position at a time—makes it fairly robust against noise, far more robust than faster algorithms like Mergesort, in which each comparison potentially moves an item a long way. Mergesort’s very efficiency makes it brittle. An early error in a Mergesort is like a fluke loss in the first round of a Single Elimination tournament,
17%
But in fact it isn’t Bubble Sort that emerges as the single best algorithm in the face of a noisy comparator. The winner of that particular honor is an algorithm called Comparison Counting Sort. In this algorithm, each item is compared to all the others, generating a tally of how many items it is bigger than. This number can then be used directly as the item’s rank.
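A sketch of Comparison Counting Sort facing a deliberately unreliable comparator (the noise model and names are mine): every item is compared with every other, its tally of wins becomes its rank, and any single bad comparison changes a tally by only one.

```python
import random

rng = random.Random(0)

def noisy_less(a, b, flip=0.10):
    """Report whether a < b, but give the wrong answer with probability `flip`."""
    truth = a < b
    return (not truth) if rng.random() < flip else truth

def comparison_counting_sort(items, less):
    """Rank each item by the tally of other items it beats, then order by tally."""
    tallies = [sum(less(other, item) for j, other in enumerate(items) if j != i)
               for i, item in enumerate(items)]
    return [item for _, item in sorted(zip(tallies, items))]

data = list(range(20))
rng.shuffle(data)
print(comparison_counting_sort(data, noisy_less))   # close to 0..19 despite 10% bad comparisons
```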
17%
Put differently, if your team is eliminated early in the postseason, it’s tough luck. But if your team fails to get to the postseason, it’s tough truth.
18%
…coming to blows every time a mating opportunity or a prime spot of grass becomes available. Though we may cringe when we see creatures turning their claws and beaks on each other, biologists tend to think of pecking orders as the violence that preempts violence. Sound familiar? It’s the search-sort tradeoff.
18%
This move from “ordinal” numbers (which only express rank) to “cardinal” ones (which directly assign a measure to something’s caliber) naturally orders a set without requiring pairwise comparisons. Accordingly, it makes possible dominance hierarchies that don’t require direct head-to-head matchups.
19%
The optimal cache eviction policy—essentially by definition, Bélády wrote—is, when the cache is full, to evict whichever item we’ll need again the longest from now.
20%
Bélády compared Random Eviction, FIFO, and variants of LRU in a number of scenarios and found that LRU consistently performed the closest to clairvoyance. The LRU principle is effective because of something computer scientists call “temporal locality”: if a program has called for a particular piece of information once, it’s likely to do so again in the near future.
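A sketch of that comparison on a synthetic access trace with temporal locality (the trace generator, cache size, and function names are all mine): FIFO, LRU, and Bélády's clairvoyant policy, which knows the whole future. LRU should land closest to clairvoyance.

```python
import random
from collections import OrderedDict, deque

def make_trace(length=20_000, n_items=500, locality=0.9, seed=0):
    """Synthetic access trace: with probability `locality`, re-request
    something seen recently (temporal locality); otherwise pick at random."""
    rng = random.Random(seed)
    trace, recent = [], deque(maxlen=20)
    for _ in range(length):
        if recent and rng.random() < locality:
            item = rng.choice(recent)
        else:
            item = rng.randrange(n_items)
        trace.append(item)
        recent.append(item)
    return trace

def lru_hit_rate(trace, size):
    cache, hits = OrderedDict(), 0
    for item in trace:
        if item in cache:
            hits += 1
            cache.move_to_end(item)          # freshly used goes to the "recent" end
        else:
            if len(cache) >= size:
                cache.popitem(last=False)    # evict the least recently used
            cache[item] = True
    return hits / len(trace)

def fifo_hit_rate(trace, size):
    cache, order, hits = set(), deque(), 0
    for item in trace:
        if item in cache:
            hits += 1
        else:
            if len(cache) >= size:
                cache.discard(order.popleft())
            cache.add(item)
            order.append(item)
    return hits / len(trace)

def belady_hit_rate(trace, size):
    """Clairvoyant eviction: on a full-cache miss, evict whichever cached item
    will next be needed furthest in the future (or never again)."""
    n = len(trace)
    next_use, last_seen = [n] * n, {}
    for i in range(n - 1, -1, -1):           # index of the next occurrence of trace[i]
        next_use[i] = last_seen.get(trace[i], n)
        last_seen[trace[i]] = i
    cache, hits = {}, 0                      # item -> index of its next use
    for i, item in enumerate(trace):
        if item in cache:
            hits += 1
        elif len(cache) >= size:
            cache.pop(max(cache, key=cache.get))
        cache[item] = next_use[i]
    return hits / n

trace = make_trace()
for name, policy in [("FIFO", fifo_hit_rate), ("LRU", lru_hit_rate),
                     ("Belady (clairvoyant)", belady_hit_rate)]:
    print(f"{name:>22}: hit rate {policy(trace, size=50):.3f}")
```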
21%
Their patent is actually for shipping items that have been recently popular in a given region to a staging warehouse in that region—like having their own CDN for physical goods. Then, when somebody places an order, the item is just down the street. Anticipating the purchases of individuals is challenging, but when predicting the purchases of a few thousand people, the law of large numbers kicks in. Somebody in Berkeley is going to order, say, recycled toilet…
22%
Sleator and Tarjan’s results showed that some “very simple self-adjusting schemes, amazingly, come within a constant factor” of clairvoyance. Namely, if you follow the LRU principle—where you simply always put an item back at the very front of the list—then the total amount of time you spend searching will never be more than twice as long as if you’d known the future.
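A sketch of the self-adjusting scheme being described, applied to a plain list (class name mine): search linearly from the front and move whatever you find to the front, which is the LRU principle in list form.

```python
class MoveToFrontList:
    """Self-organizing list: each accessed item is moved to the front, so
    frequently and recently used items stay cheap to find."""
    def __init__(self, items):
        self.items = list(items)

    def access(self, target):
        """Scan from the front; return how many items were examined."""
        for i, item in enumerate(self.items):
            if item == target:
                self.items.insert(0, self.items.pop(i))   # move to front
                return i + 1
        raise KeyError(target)

lst = MoveToFrontList("abcdefgh")
print([lst.access(t) for t in "hhgahhhg"])   # repeat requests quickly become cheap
```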
22%
“Many people hold the bias that human memory is anything but optimal,” wrote Anderson and Schooler. “They point to the many frustrating failures of memory. However, these criticisms fail to appreciate the task before human memory, which is to try to manage a huge stockpile of memories. In any system responsible for managing a vast data base there must be failures of retrieval. It is just too expensive to maintain access to an unbounded number of items.”