Algorithms to Live By: The Computer Science of Human Decisions
Rate it:
Open Preview
13%
Flag icon
Hollerith’s firm merged with several others in 1911 to become the Computing-Tabulating-Recording Company. A few years later it was renamed—to International Business Machines, or IBM.
13%
Flag icon
The first code ever written for a “stored program” computer was a program for efficient sorting. In fact, it was the computer’s ability to outsort IBM’s dedicated card-sorting machines that convinced the US government their enormous financial investment in a general-purpose machine was justified.
13%
Flag icon
Whether it’s finding the largest or the smallest, the most common or the rarest, tallying, indexing, flagging duplicates, or just plain looking for the thing you want, they all generally begin under the hood with a sort.
13%
Flag icon
What makes Google so dominant as a means of accessing the world’s information is less that it finds our text within hundreds of millions of webpages—its 1990s competitors could generally do that part well enough—but that it sorts those webpages so well, and only shows us the most relevant ten.
13%
Flag icon
But with sorting, size is a recipe for disaster: perversely, as a sort grows larger, “the unit cost of sorting, instead of falling, rises.”
13%
Flag icon
Computers, though, must routinely sort millions of items in a single go. For that, as the line from Jaws puts it, we’re going to need a bigger boat—and a better algorithm.
14%
Flag icon
This number, a bit over 80 unvigintillion, is 52 factorial, or “52!” in mathematical notation—the number of ways that a deck of 52 cards can possibly be ordered. By taking roughly that many attempts, sooner or later we are bound to start with a shuffled deck that is in fact completely sorted by chance.
14%
Flag icon
The fine folks at Guinness care only about best-case performance (and beer).
14%
Flag icon
Computer science, however, almost never cares about the best case.
14%
Flag icon
Moreover, a computer scientist would want to know the worst sort time. Worst-case analysis lets us make hard guarantees: that a critical process will finish in time, that deadlines won’t be blown.
14%
Flag icon
That is, rather than expressing an algorithm’s performance in minutes and seconds, Big-O notation provides a way to talk about the kind of relationship that holds between the size of the problem and the program’s running time.
14%
Flag icon
This is the rosiest class of problems there is: called “Big-O of one,” written O(1), it is also known as “constant time.”
14%
Flag icon
What’s more, the existence of any linear-time factors will, in Big-O notation, swamp all constant-time factors.
14%
Flag icon
If that seems crazy, remember that computers deal with values of n that could easily be in the thousands, millions, or billions.
14%
Flag icon
This turns out to be “Big-O of n-squared,” written O(n2) and also known as “quadratic time.”
14%
Flag icon
There’s “exponential time,” O(2n), where each additional guest doubles your work. Even worse is “factorial time,” O(n!), a class of problems so truly hellish that computer scientists only talk about it when they’re joking—as we were in imagining shuffling a deck until it’s sorted—or when they really, really wish they were.
14%
Flag icon
In this way computer scientists are glimpsing God’s blueprints every bit as much as the particle physicists and cosmologists.
14%
Flag icon
Well, even just confirming that a shelf of n books is sorted cannot be done in constant time, since it requires checking all n of them.
15%
Flag icon
In 1936, IBM began producing a line of machines called “collators” that could merge two separately ordered stacks of cards into one.
15%
Flag icon
also has real applications in small-scale domestic sorting problems. Part of the reason why it’s so widely used is that it can easily be parallelized.
15%
Flag icon
It’s been proven that if we want to fully sort n items via a series of head-to-head comparisons, there’s just no way to compare them any fewer than O(n log n) times.
15%
Flag icon
Here’s the kicker: if you want to group n items into m buckets, the grouping can be done in O(nm) time—that is, the time is simply proportional to the number of items times the number of buckets. And as long as the number of buckets is relatively small compared to the number of items, Big-O notation will round that to O(n), or linear time.
15%
Flag icon
Instead, a good strategy—ratified by human and machine librarians alike—is to Bucket Sort until you get down to small enough piles that Insertion Sort is reasonable, or to have a Mergesort pizza party.
16%
Flag icon
The basic principle is this: the effort expended on sorting materials is just a preemptive strike against the effort it’ll take to search through them later.
16%
Flag icon
Err on the side of messiness. Sorting something that you will never search is a complete waste; searching something you never sorted is merely inefficient.
16%
Flag icon
For most domestic bookshelves, almost none of the conditions that make sorting worthwhile are true.
16%
Flag icon
What’s more, we search with our quick eyes and sort with slow hands.
16%
Flag icon
mathematician—in fact, he’s barely remembered as having been one. Today he’s best known by his pen name, Lewis Carroll, under which he wrote Alice’s Adventures in Wonderland and many other beloved works of nineteenth-century literature.
16%
Flag icon
Dodgson’s complaint was directed at the structure of the Single Elimination tournament, where players are paired off with one another and eliminated from competition as soon as they lose a single match. As Dodgson forcefully argued, the true second-best player could be any of the players eliminated by the best—not just the last-eliminated one.
16%
Flag icon
Said plainly, the silver medal is a lie.
16%
Flag icon
One of the most familiar algorithms in sports is the Round-Robin format, where each of n teams eventually plays every one of the other n − 1 teams.
16%
Flag icon
Having every team grapple with every other team is like having guests exchange hugs at our dinner party: the dreaded O(n2), quadratic time. Ladder tournaments—popular in sports like badminton, squash, and racquetball—put players in a linear ranking, with each player allowed to issue a direct challenge to the player immediately above them, exchanging places if they prevail. Ladders are the Bubble Sorts of the athletic world and are thus also quadratic, requiring O(n2) games to reach a stable ranking. Perhaps the most prevalent tournament format, however, is a bracket tournament—as in the famous ...more
16%
Flag icon
March Madness,” among many others. The March Madness tournament progresses from the “Round of 64”...
This highlight has been truncated due to consecutive passage length restrictions.
16%
Flag icon
and the “Round of 32” to the “Sweet 16,” “Elite Eight,” “Final Four,” and the fi...
This highlight has been truncated due to consecutive passage length restrictions.
16%
Flag icon
Sweet 16,” “Elite Eight,” “Final Four,” and the finals. Each round divides the field in half: does that sound familiarly logarithmic? These tournaments are effectively Mergesort, beginning with unsor...
This highlight has been truncated due to consecutive passage length restrictions.
16%
Flag icon
improvement: algorithm design at work. Six rounds of March Madness sounds about right, but wait a second: 192 games? The NCAA tournament is only 63 games long. In fact,
16%
Flag icon
fact, March
16%
Flag icon
The advantage is that it runs in linear time: since every game eliminates exactly one team, in order to have one team left standing you need just n − 1 games—a linear number. The disadvantage is that, well, you never really figure out the standings aside from first place. Ironically, in Single Elimination no tournament structure is actually necessary at all. Any 63 games will yield a single undefeated champion.
16%
Flag icon
is that it runs in linear time: since every game eliminates exactly one team, in order to have one team left standing you need just n − 1 games—a linear number. The disadvantage is that, well, you never really figure out the standings aside from first place. Ironically, in Single Elimination no tournament structure is actually necessary at all. Any 63 games will yield a single undefeated champion. For instance, you could simply have a single “king of the hill” team take on challengers one by one until it is dethroned, at which point whoever defeated it takes over its spot and continues.
17%
Flag icon
This format would have the drawback of needing 63 separate rounds, however, as games couldn’t happen in parallel; also, one team could potentially have to play as many as 63 games in a row,...
This highlight has been truncated due to consecutive passage length restrictions.
17%
Flag icon
As Trick points out, sports leagues aren’t concerned with determining the rankings as quickly and expeditiously as possible. Instead, sports calendars are explicitly designed to maintain tension throughout the season, something that has rarely been a concern of sorting theory.
17%
Flag icon
everybody. Why do they do n2 in order to just get, in some sense, the top, if that’s all they care
17%
Flag icon
In other words, why do a full O(n2) Round-Robin and then some, if we know we can do a full sort in linearithmic time, and can crown an undefeated Single Elimination champion in less than n games? Well, minimizing the number of games isn’t actually in the league’s interest. In computer science unnecessary comparisons are always bad, a waste of time and effort. But in sports that’s far from the case. In many respects, after all, the games themselves are the point.
17%
Flag icon
O(n
This highlight has been truncated due to consecutive passage length restrictions.
17%
Flag icon
sort in linearithmic time, and can crown an undefeated Single Elimination champion in less than n games? Well, minimizing the number of games isn’t actually in the league’
17%
Flag icon
Griping Rights: Noise and Robustness
17%
Flag icon
are won by the stronger team 70% of the time, and winning the tournament involves prevailing in
17%
Flag icon
6 straight games, then the best team has only a 0.70 to the 6th power—less than 12%—chance of winning the tournament! Put another way,
17%
Flag icon
tournament would crown the league’s truly best team just once a decade. It may be that in some sport...
This highlight has been truncated due to consecutive passage length restrictions.
17%
Flag icon
imagine. “A 3:2 score gives the winning team only a 5-in-8 chance of actually being a better team … Personally, I don’t find this to be very impressive. Even a 6:1 blowout leaves a