Algorithms to Live By: The Computer Science of Human Decisions
some of computer science’s most hallowed algorithms go out the window—and some of its most maligned have their day of redemption.
Its very inefficiency—moving items only one position at a time—makes it fairly robust against noise, far more robust than faster algorithms like Mergesort, in which each comparison potentially moves an item a long way.
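A minimal Python simulation of that claim (the 10% comparator error rate, the data, and the displacement metric are my illustrative choices, not the book's): both sorts get the same noisy comparator, and we measure how far items end up from their true positions. Bubble Sort's one-slot-at-a-time moves typically keep the damage local; Mergesort's long-range moves can carry an early mistake much further.

```python
import random

def noisy_less(a, b, p=0.1):
    """Compare a < b, but give the wrong answer with probability p."""
    return (a < b) if random.random() > p else not (a < b)

def bubble_sort(items, less):
    xs = list(items)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if less(xs[j + 1], xs[j]):
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def merge_sort(items, less):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid], less), merge_sort(items[mid:], less)
    merged = []
    while left and right:
        merged.append(left.pop(0) if less(left[0], right[0]) else right.pop(0))
    return merged + left + right

def max_displacement(xs):
    """How far any item lands from its true sorted position
    (the data is a permutation of 0..n-1, so x belongs at index x)."""
    return max(abs(i - x) for i, x in enumerate(xs))

random.seed(0)
data = random.sample(range(100), 100)
print("bubble:", max_displacement(bubble_sort(data, noisy_less)))
print("merge: ", max_displacement(merge_sort(data, noisy_less)))
```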
An early error in a Mergesort is like a fluke loss in the first round of a Single Elimination tournament, which can not only dash a favored team’s championship hopes but also permanently relegate them to the bottom half of the results.* In a Ladder tournament, on the other hand, as in a Bubble Sort, a fluke loss would only set a player back a single place in the standings.
Comparison Counting Sort operates exactly like a Round-Robin tournament.
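A sketch of that correspondence in Python (the data is illustrative): each item "plays" every other item once, and its win count is its final rank.

```python
def comparison_counting_sort(items):
    """Round-robin sort: each item is compared against every other
    item once; its number of 'wins' fixes its final position."""
    wins = [sum(other < item for other in items) for item in items]
    result = [None] * len(items)
    for item, w in zip(items, wins):
        result[w] = item          # assumes distinct items; ties would collide
    return result

print(comparison_counting_sort([31, 4, 15, 9, 26]))  # [4, 9, 15, 26, 31]
```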
The Mergesort postseason is chancy, but the Comparison Counting regular season is not; championship rings aren’t robust, but divisional standings are literally as robust as it gets.
In multi-handed poker cash games, there will often be one weak player—a wealthy amateur, for instance—feeding a table full of professionals, who then don’t much care who among them is better than whom.
“There has to be a disagreement between you and them about who’s better—or somebody has to be willingly losing.”
Displacement happens when an animal uses its knowledge of the hierarchy to determine that a particular confrontation simply isn’t worth it.
Establishing an order ahead of time is less violent than coming to blows every time a mating opportunity or a prime spot of grass becomes available.
For one, it implies that the number of hostile confrontations encountered by each individual will grow substantially—at least logarithmically, and perhaps quadratically—as the group gets bigger.
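(Roughly: establishing a full pecking order by a comparison-efficient sort costs on the order of n log n confrontations in total, about log n apiece, while settling it round-robin style means every pair fights, n(n−1)/2 confrontations in all.)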
“I think there is a pretty high degree of consensus about what the list looks like.” Only when these rankings differ will cash games ensue.
This sporting contest is the marathon, and it suggests something critical: a race is fundamentally different from a fight.
An Olympic boxer must risk concussion O(log n) times, usually from 4 to 6, to make it to the podium; allowing a greater number of athletes into the games would imperil the health of all.
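To see where that 4-to-6 figure comes from: a single-elimination bracket of n fighters runs ⌈log₂ n⌉ rounds, and ⌈log₂ 16⌉ = 4 while ⌈log₂ 64⌉ = 6, so fields in that size range yield exactly the quoted bout counts.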
Being able to assign a simple numerical measure of performance results in a constant-time algorithm for status. This move from “ordinal” numbers (which only express rank) to “cardinal” ones (which directly assign a measure to something’s caliber) naturally orders a set without requiring pairwise comparisons.
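A tiny Python illustration (the names and scores are hypothetical): once each competitor carries a cardinal score, any status question is a constant-time lookup, and a full ordering is an ordinary key-based sort rather than a tournament.

```python
# Hypothetical cardinal benchmark: one number per competitor.
score = {"alice": 1_200_000, "bob": 340_000, "carol": 870_000}

def outranks(a, b):
    return score[a] > score[b]        # O(1): no head-to-head bout needed

standings = sorted(score, key=score.get, reverse=True)
print(outranks("alice", "bob"), standings)  # True ['alice', 'carol', 'bob']
```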
Having a benchmark—any benchmark—solves the computational problem of scaling up a sort.
Money, of course, need not be the criterion; a rule like “respect your elders,” for instance, likewise settles questions of people’s status by reference to a common quantity.
But the existence of any benchmark at all transforms the question of national status from one demanding at least a linearithmic number of tussles and resolutions into something with a single reference point that ranks all.
But just as the tidiness of a scholar’s desk may hide the messiness of their mind, so does the apparent tidiness of a computer’s file system obscure the highly engineered chaos of how data is actually being stored underneath the nested-folder veneer.
In 1946, Arthur Burks, Herman Goldstine, and John von Neumann, working at the Institute for Advanced Study in Princeton, laid out a design proposal for what they called an electrical “memory organ.”
Instead, the trio proposed what they believed to be the next best thing: “a hierarchy of memories, each of which has greater capacity than the preceding but which is less quickly accessible.”
Wilkes’s proposal was implemented in the IBM 360/85 supercomputer later in the 1960s, where it acquired the name of the “cache.”
Processors have caches. Hard drives have caches. Operating systems have caches. Web browsers have caches.
What hasn’t improved at that rate is the performance of memory, which means that relative to processing time, the cost of accessing memory is also increasing exponentially.
Likewise, a factory that doubles its manufacturing speed each year—but has the same number of parts shipped to it from overseas at the same sluggish pace—will mean little more than a factory that’s twice as idle.
In the 1990s this began to be known as the “memory wall.” Computer science’s best defense against hitting that wall has been an ever more elaborate hierarchy: caches for caches for caches, all the way down.
Bélády was born in 1928 in Hungary, where he studied as a mechanical engineer before fleeing to Germany during the 1956 Hungarian Revolution with nothing but a satchel containing “one change of underwear and my graduation paper.”
The optimal cache eviction policy—essentially by definition, Bélády wrote—is, when the cache is full, to evict whichever item we’ll need again the longest from now.
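A minimal sketch of Bélády's rule in Python (the cache contents and request trace are illustrative). The catch is the future_requests argument: no real system has it, which is why Bélády's policy serves as an unbeatable benchmark rather than an implementation.

```python
def belady_victim(cache, future_requests):
    """Evict the cached item whose next use lies furthest in the
    future; an item never needed again is the ideal victim."""
    def next_use(item):
        try:
            return future_requests.index(item)
        except ValueError:
            return float("inf")   # never requested again
    return max(cache, key=next_use)

print(belady_victim({"a", "b", "c"}, ["b", "a", "b", "c"]))  # 'c'
```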
The LRU principle is effective because of something computer scientists call “temporal locality”: if a program has called for a particular piece of information once, it’s likely to do so again in the near future.
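A minimal LRU cache sketch in Python (the capacity and keys are illustrative), leaning on the standard library's ordered dictionary:

```python
from collections import OrderedDict

class LRUCache:
    """On a hit, mark the item most recently used; when full,
    evict the least recently used item."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # refresh recency on a hit
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                 # touch 'a', so 'b' is now least recent
cache.put("c", 3)              # evicts 'b'
print(list(cache.items))       # ['a', 'c']
```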
But despite an abundance of innovative caching schemes, some of which can beat LRU under the right conditions, LRU itself—and minor tweaks thereof—is the overwhelming favorite of computer scientists, and is used in a wide variety of deployed applications at a variety of scales.
Unless we have good reason to think otherwise, it seems that our best guide to the future is a mirror image of the past. The nearest thing to clairvoyance is to assume that history repeats itself—backward.
Meanwhile, the lobby of the Moffitt Undergraduate Library—the location of the most prominent and accessible shelves—showcases the library’s most recently acquired books. This is instantiating a kind of FIFO cache, privileging the items that were last added to the library, not last read. The dominant performance of the LRU algorithm in most tests that computer scientists have thrown at it leads to a simple suggestion: turn the library inside out. Put acquisitions in the back, for those who want to find them. And put the most recently returned items in the lobby, where they are ripe for the browsing.
Allowing the returned books to adorn the lobby instead would give students a chance to short-circuit the shelving process entirely. No employees would have to venture into the stacks to deposit the volumes, and no students would have to venture into the stacks to get them back out. That’s exactly how caching is meant to work.
“We actually made a map of the country, on the scale of a mile to the mile!” “Have you used it much?” I enquired. “It has never been spread out, yet,” said Mein Herr: “the farmers objected: they said it would cover the whole country, and shut out the sunlight! So we now use the country itself, as its own map, and I assure you it does nearly as well.” —LEWIS CARROLL
The reality is that the Internet is all about bundles of physical wires and racks of metal. And it’s much more closely tied to geography than you might expect. Engineers think about geography on a tiny scale when they design computer hardware: faster memory is usually placed closer to the processor, minimizing the length of the wires that information has to travel along.
First, when you are deciding what to keep and what to throw away, LRU is potentially a good principle to use—much better than FIFO.
Second, exploit geography. Make sure things are in whatever cache is closest to the place where they’re typically used.
Where your belongings are concerned, your closet is one cache level, your basement another, and a self-storage locker a third.
The left-side insertion rule, Noguchi specifies, has to be followed for old files as well as new ones: every time you pull out a file to use its contents, you must put it back as the leftmost file when you return it to the box. And when you search for a file, you always start from the left-hand side as well. The most recently accessed files are thus the fastest to find.
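Noguchi's rule is what computer scientists call "move to front." A minimal Python sketch (the file names are illustrative):

```python
def fetch(box, name):
    """Scan from the left; on a hit, reinsert the file at the left
    end, so recently used files stay cheapest to find."""
    for i, f in enumerate(box):
        if f == name:
            box.insert(0, box.pop(i))   # move to front
            return i                    # how far we had to scan
    raise KeyError(name)

box = ["taxes", "insurance", "receipts"]
fetch(box, "receipts")
print(box)  # ['receipts', 'taxes', 'insurance']
```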
The definitive paper on self-organizing lists, published by Daniel Sleator and Robert Tarjan in 1985, examined (in classic computer science fashion) the worst-case performance of various ways to organize the list given all possible sequences of requests.
Namely, if you follow the LRU principle—where you simply always put an item back at the very front of the list—then the total amount of time you spend searching will never be more than twice as long as if you’d known the future. That’s not a guarantee any other algorithm can make.
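In Sleator and Tarjan's terms, move-to-front is 2-competitive: on any request sequence, its total search cost is at most (roughly, up to small additive terms) twice that of the optimal strategy that knows the whole sequence in advance.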
What might appear to others to be an unorganized mess is, in fact, a self-organizing mess. Tossing things back on the top of the pile is the very best you can do, shy of knowing the future. In the previous chapter we examined cases where leaving something unsorted was more efficient than taking the time to sort everything; here, however, there’s a very different reason why you don’t need to organize it. You already have.
He confirmed, for instance, that practicing a list multiple times makes it persist longer in memory, and that the number of items one can accurately recall goes down as time passes.
So as you age, and begin to experience these sporadic latencies, take heart: the length of a delay is partly an indicator of the extent of your experience. The effort of retrieval is a testament to how much you know. And the rarity of those lags is a testament to how well you’ve arranged it: keeping the most important things closest to hand.
If you’re concerned with minimizing maximum lateness, then the best strategy is to start with the task due soonest and work your way toward the task due last. This strategy, known as Earliest Due Date, is fairly intuitive.
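A minimal Earliest Due Date sketch in Python (the task names and times are made up); it also reports the maximum lateness the schedule incurs:

```python
def earliest_due_date(tasks):
    """Order tasks by due date to minimize maximum lateness.
    tasks: list of (name, duration, due) tuples, times in days."""
    schedule = sorted(tasks, key=lambda t: t[2])     # soonest due first
    now, max_lateness = 0, 0
    for name, duration, due in schedule:
        now += duration
        max_lateness = max(max_lateness, now - due)
    return [name for name, _, _ in schedule], max_lateness

tasks = [("report", 3, 4), ("slides", 1, 2), ("review", 2, 8)]
print(earliest_due_date(tasks))  # (['slides', 'report', 'review'], 0)
```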
Maybe instead we want to minimize the number of foods that spoil. Here a strategy called Moore’s Algorithm gives us our best plan. Moore’s Algorithm says that we start out just like with Earliest Due Date—by scheduling out our produce in order of spoilage date, earliest first, one item at a time. However, as soon as it looks like we won’t get to eating the next item in time, we pause, look back over the meals we’ve already planned, and throw out the biggest item (that is, the one that would take the most days to consume).
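A sketch of Moore's Algorithm (often called Moore-Hodgson) in Python; the fridge contents are invented. It walks the Earliest Due Date order and, whenever a deadline would be blown, evicts the longest job scheduled so far:

```python
import heapq

def moores_algorithm(tasks):
    """Minimize the *number* of late tasks. tasks: (name, duration, due)."""
    heap, order, dropped, now = [], [], [], 0
    for i, (name, duration, due) in enumerate(sorted(tasks, key=lambda t: t[2])):
        heapq.heappush(heap, (-duration, i, name))  # max-heap on duration
        order.append((i, name))
        now += duration
        if now > due:                      # we'd miss this deadline, so
            d, j, n = heapq.heappop(heap)  # toss the biggest item so far
            dropped.append(n)
            order.remove((j, n))
            now += d                       # d is negative: reclaim its time
    return [n for _, n in order], dropped

fridge = [("leftovers", 1, 1), ("stew", 4, 5), ("salad", 1, 3), ("fruit", 2, 6)]
print(moores_algorithm(fridge))  # (['leftovers', 'salad', 'fruit'], ['stew'])
```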
In an industrial or bureaucratic context where you can’t simply discard a project, but in which the number—rather than the severity—of late projects is still your biggest concern, Moore’s Algorithm is just as indifferent about how those late tasks are handled. Anything booted from the main portion of your schedule can get done at the very end, in any order; it doesn’t matter, as they’re all already late.
Minimizing the sum of completion times leads to a very simple optimal algorithm called Shortest Processing Time: always do the quickest task you can.
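As a near one-liner sketch (the tasks are invented): with durations known, Shortest Processing Time is just a sort.

```python
def shortest_processing_time(tasks):      # tasks: (name, duration)
    """Quickest task first minimizes the sum of completion times."""
    return sorted(tasks, key=lambda t: t[1])

print(shortest_processing_time([("email", 1), ("errand", 3), ("thesis", 100)]))
```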
The optimal strategy for that goal is a simple modification of Shortest Processing Time: divide the weight of each task by how long it will take to finish, and then work in order from the highest resulting importance-per-unit-time (call it “density” if you like, to continue the weight metaphor) to the lowest.
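A sketch of the weighted version (the jobs and weights are invented): sort by weight over duration, highest density first; with equal weights it reduces to plain Shortest Processing Time, and the density is exactly the "hourly rate" mentioned next.

```python
def weighted_spt(tasks):
    """Sort by importance per unit time (weight / duration),
    highest first. tasks: list of (name, duration, weight)."""
    return sorted(tasks, key=lambda t: t[2] / t[1], reverse=True)

jobs = [("big contract", 5, 1000), ("quick gig", 1, 300), ("chore", 2, 100)]
print([name for name, _, _ in weighted_spt(jobs)])
# ['quick gig', 'big contract', 'chore']  (rates: 300, 200, 50 per day)
```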
In business contexts, “weight” might easily be translated to the amount of money each task will bring in. The notion of dividing reward by duration translates, therefore, to assigning each task an hourly rate.