Algorithms to Live By: The Computer Science of Human Decisions
Such problems can’t simply be dismissed as a failure to achieve management goals. Rather, they are the opposite: the ruthless and clever optimization of the wrong thing.
Simply put, Cross-Validation means assessing not only how well a model fits the data it’s given, but how well it generalizes to data it hasn’t seen. Paradoxically, this may involve using less data.
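The idea can be sketched in a few lines of Python. Everything here is illustrative and not from the book — a toy dataset, a closed-form line fit, five folds — but it shows the mechanism: fit on part of the data, score only on the part the model never saw.

```python
import random

random.seed(0)

# Toy data (hypothetical): y = 2x plus noise.
data = [(x, 2 * x + random.gauss(0, 1.0)) for x in range(40)]

def fit_line(points):
    # Closed-form least squares for y = a*x + b.
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def mse(model, points):
    a, b = model
    return sum((y - (a * x + b)) ** 2 for x, y in points) / len(points)

def cross_validate(points, k=5):
    # Hold out each fold in turn; score the model only on unseen data.
    pts = points[:]
    random.shuffle(pts)
    fold = len(pts) // k
    scores = []
    for i in range(k):
        test = pts[i * fold:(i + 1) * fold]
        train = pts[:i * fold] + pts[(i + 1) * fold:]
        scores.append(mse(fit_line(train), test))
    return sum(scores) / k

print("cross-validated error:", round(cross_validate(data), 2))
```

Note the "paradox" the passage mentions: each fold's model is trained on *less* data than a fit to the full set would be, yet the held-out score is a more honest estimate of how the model will generalize.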
Aside from withholding some of the available data points, it is also useful to consider testing the model with data derived from some other form of evaluation entirely.
In machine learning, the advantages of moving slowly emerge most concretely in a regularization technique known as Early Stopping.
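The control logic of Early Stopping fits in a dozen lines. The validation curve below is a simulated stand-in (in real training it would come from a held-out set, as in the cross-validation passage above): error falls, bottoms out, then rises as the model starts overfitting, and we stop once it has failed to improve for a while.

```python
def early_stopping(val_loss, max_steps=100, patience=5):
    # Stop training once validation loss hasn't improved for `patience` steps.
    best_loss, best_step = float("inf"), 0
    for step in range(max_steps):
        loss = val_loss(step)
        if loss < best_loss:
            best_loss, best_step = loss, step
        elif step - best_step >= patience:
            break  # no recent improvement: stop and keep the best model
    return best_step, best_loss

# Stand-in validation curve: improves, then overfitting makes it worse again.
curve = lambda t: (t - 30) ** 2 / 100 + 1.0
step, loss = early_stopping(curve)
print(step, loss)  # stops at the bottom of the curve, around step 30
```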
Many prediction algorithms, for instance, start out by searching for the single most important factor rather than jumping to a multi-factor model. Only after finding that first factor do they look for the next most important factor to add to the model, then the next, and so on.
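A simple greedy variant of this (forward stagewise fitting — my sketch, with hypothetical data) makes the one-factor-at-a-time idea concrete: fit the single best factor, subtract what it explains, then look for the next best factor in the residue.

```python
def univariate_fit(x, y):
    # Best least-squares coefficient c for y ≈ c*x (no intercept).
    sxx = sum(v * v for v in x)
    return sum(a * b for a, b in zip(x, y)) / sxx if sxx else 0.0

def sse(pred, y):
    return sum((p - v) ** 2 for p, v in zip(pred, y))

def forward_stagewise(features, y, steps=2):
    # Greedy: repeatedly add the one factor that best explains what's left.
    residual = list(y)
    model = []
    for _ in range(steps):
        def fit_error(name):
            c = univariate_fit(features[name], residual)
            return sse([c * v for v in features[name]], residual)
        best = min(features, key=fit_error)
        c = univariate_fit(features[best], residual)
        model.append((best, round(c, 3)))
        residual = [r - c * v for r, v in zip(residual, features[best])]
    return model

# Hypothetical data: outcome driven mostly by factor "a", a little by "b".
features = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]}
y = [3 * a + 0.5 * b for a, b in zip(features["a"], features["b"])]
print(forward_stagewise(features, y))  # finds "a" first, then "b"
```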
Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting.
If you have high uncertainty and limited data, then do stop early by all means. If you don’t have a clear read on how your work will be evaluated, and by whom, then it’s not worth the extra time to make it perfect with respect to your own (or anyone else’s) idiosyncratic guess at what perfection might be.
This is an instance of what’s known to mathematicians and computer scientists as a “constrained optimization” problem: how to find the single best arrangement of a set of variables, given particular rules and a scorekeeping measure.
In the traveling salesman problem, the question isn’t whether a computer (or a mathematician) could find the shortest route: theoretically, one can simply crank out a list of all the possibilities and measure each one. Rather, the issue is that as the number of towns grows, the list of possible routes connecting them explodes. A route is just an ordering of the towns, so trying them all by brute force is the dreaded O(n!) “factorial time”—the computational equivalent of sorting a deck of cards by throwing them in the air until they happen to land in order.
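The brute-force approach really is this simple — and this doomed. A sketch with five made-up towns: fix the starting town and try every ordering of the rest, which is (n−1)! routes.

```python
import itertools
import math

# Hypothetical town coordinates, not from the book.
towns = {"A": (0, 0), "B": (1, 5), "C": (4, 1), "D": (3, 3), "E": (6, 2)}

def route_length(order):
    # Total distance visiting towns in `order` and returning to the start.
    legs = zip(order, order[1:] + order[:1])
    return sum(math.dist(towns[a], towns[b]) for a, b in legs)

# Brute force: fix the starting town and try every ordering of the rest.
start = "A"
rest = [t for t in towns if t != start]
best = min(
    (tuple([start] + list(p)) for p in itertools.permutations(rest)),
    key=route_length,
)
print(best, round(route_length(best), 2))
# 4! = 24 orderings here — fine. For 50 towns, 49! routes: hopeless.
```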
They asserted what’s now known as the Cobham–Edmonds thesis: an algorithm should be considered “efficient” if it runs in what’s called “polynomial time”—that is, O(n²), O(n³), or in fact n to the power of any number at all. A problem, in turn, is considered “tractable” if we know how to solve it using an efficient algorithm. A problem we don’t know how to solve in polynomial time, on the other hand, is considered “intractable.” And at anything but the smallest scales, intractable problems are beyond the reach of solution by computers, no matter how powerful.*
If you can’t solve the problem in front of you, solve an easier version of it—and then see if that solution offers you a starting point, or a beacon, in the full-blown problem. Maybe it does. What relaxation cannot do is offer you a guaranteed shortcut to the perfect answer. But computer science can also quantify the tradeoff that relaxation offers between time and solution quality. In many cases, the ratio is dramatic, a no-brainer—for instance, an answer at least half as good as the perfect solution in a quadrillionth of the time.
[This is a] kind of optimization problem known as “discrete optimization”—that is, there’s no smooth continuum among its solutions. The salesman goes either to this town or to that one; you’re either at table five or at table six.
[D]iscrete optimization’s commitment to whole numbers—a fire department can have one engine in the garage, or two, or three, but not two and a half fire trucks, or π of them—is what makes discrete optimization problems so hard to solve. In fact, both the fire truck problem and the party invitation problem are intractable:
[T]here do exist a number of efficient strategies for solving the continuous versions of these problems, where any fraction or decimal is a possible solution. Researchers confronted with a discrete optimization problem might gaze at those strategies enviously—but they also can do more than that. They can try to relax their discrete problem into a continuous one and see what happens.
With the relaxed solution in hand, we can decide how to translate those fractions back into reality. We could, for example, choose to simply round them as necessary, sending invitations to everyone who got “half an invitation” or more in the relaxed scenario. We could also interpret these fractions as probabilities.
[It] turns out that for the invitations problem, Continuous Relaxation with rounding will give us an easily computed solution that’s not half bad: it’s mathematically guaranteed to get everyone you want to the party while sending out at most twice as many invitations as the best solution obtainable by brute force.
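A toy sketch of relax-then-round, with a hypothetical guest list. The invitation problem here is modeled as minimum vertex cover; one convenient fact (which I'm leaning on to avoid an LP solver) is that the relaxed optimum for vertex cover is always half-integral, so on a tiny instance we can find it by searching the values {0, ½, 1}.

```python
import itertools

# Hypothetical party: each pair is a relationship that must be "covered"
# by inviting at least one of the two people.
people = ["alice", "bob", "carol", "dave"]
pairs = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"),
         ("dave", "alice"), ("alice", "carol")]

def feasible(x):
    return all(x[u] + x[v] >= 1 for u, v in pairs)

# Relaxed problem: fractional invitations allowed. Half-integrality means
# searching {0, 1/2, 1} finds the true continuous optimum on this toy case.
relaxed = min(
    (dict(zip(people, vals))
     for vals in itertools.product([0, 0.5, 1], repeat=len(people))),
    key=lambda x: sum(x.values()) if feasible(x) else float("inf"),
)

# Round: invite everyone who got at least "half an invitation".
invited = {p for p, v in relaxed.items() if v >= 0.5}

# Exact brute force over all guest lists, for comparison.
exact = min(
    (set(c) for r in range(len(people) + 1)
     for c in itertools.combinations(people, r)
     if all(u in c or v in c for u, v in pairs)),
    key=len,
)
print(len(invited), len(exact))  # rounded cover is at most twice the optimum
```

On this instance the guarantee is tight: rounding invites all four people, while the best brute-force guest list has two — exactly the factor-of-two bound the passage describes.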
“You don’t have to do what your teachers tell you. You don’t have to do what I tell you. You don’t even have to obey the law. There are consequences to everything, and you get to decide whether you want to face those consequences.”
[R]ather than spending eons searching for an unattainable perfect answer, using Lagrangian Relaxation allows him to ask questions like, “How close can you get?” Close enough, it turns out, to make everyone happy.
[T]here are cases where randomized algorithms can produce good approximate answers to difficult questions faster than all known deterministic algorithms. And while they do not always guarantee the optimal solutions, randomized algorithms can get surprisingly close to them in a fraction of the time.
[R]andomized approaches can outperform even the best deterministic ones. Sometimes the best solution to a problem is to turn to chance rather than trying to fully reason out an answer. But merely knowing that randomness can be helpful isn’t good enough. You need to know when to rely on chance, in what way, and to what extent.
[H]e doesn’t necessarily mean that sampling will offer you more precise answers than exhaustive analysis: there will always be some error associated with a sampling process, though you can reduce it by ensuring your samples are indeed random and by taking more and more of them. What he means is that sampling is better because it gives you an answer at all, in cases where nothing else will.
Metropolis named this approach—replacing exhaustive probability calculations with sample simulations—the Monte Carlo Method.
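The classic birthday problem (my example, not the book's) shows the trade in miniature: the exact probability that 23 people share a birthday takes a little care to derive, but a Monte Carlo simulation just plays out random rooms and counts.

```python
import random

random.seed(1)

def shared_birthday(n_people=23):
    # Simulate one room: does any birthday repeat?
    days = [random.randrange(365) for _ in range(n_people)]
    return len(set(days)) < len(days)

# Monte Carlo: estimate the probability by sampling many random rooms.
trials = 100_000
estimate = sum(shared_birthday() for _ in range(trials)) / trials

# Exact answer for comparison: 1 - (365/365)(364/365)...(343/365).
none_shared = 1.0
for i in range(23):
    none_shared *= (365 - i) / 365
exact = 1 - none_shared

print(round(estimate, 3), round(exact, 3))  # both land near 0.507
```

More trials shrink the sampling error (roughly as one over the square root of the trial count) — the "taking more and more of them" from the previous passage.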
[In] modern encryption, for instance, secret primes known only to the sender and recipient get multiplied together to create huge composite numbers that can be transmitted publicly without fear, since factoring the product would take any eavesdropper way too long to be worth attempting. Thus virtually all secure communication online—be it commerce, banking, or email—begins with a hunt for prime numbers.
In practice, modern cryptographic systems, the ones that encrypt Internet connections and digital transactions, are tuned for a false positive rate of less than one in a million billion billion. In other words, that’s a decimal that begins with twenty-four zeros—less than one false prime for the number of grains of sand on Earth.
Though you may have never heard of the Miller-Rabin test, your laptop, tablet, and phone know it well. Several decades after its discovery, it is still the standard method used to find and check primes in many domains. It’s working behind the scenes whenever you use your credit card online, and almost any time secure communications are sent through the air or over wires.
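A compact version of the test itself — a sketch for illustration; production systems use hardened library implementations. Each round picks a random base; a composite number survives a round with probability below 1/4, so the failure odds shrink exponentially with the round count, which is how the one-in-10²⁴ tuning mentioned above is achieved.

```python
import random

def is_probable_prime(n, rounds=40):
    # Miller-Rabin probabilistic primality test.
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)  # a^d mod n, computed efficiently
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # base a witnesses that n is composite
    return True  # no witness found: almost certainly prime

print(is_probable_prime(2**61 - 1))  # a Mersenne prime: True
print(is_probable_prime(2**61 + 1))  # divisible by 3: False
```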
Computer science gives us a way to articulate the complexity of evaluating all possible social provisions for something like an injured shin. But fortunately, it also provides tools for dealing with that complexity. And the sampling-based Monte Carlo algorithms are some of the most useful approaches in that toolbox.
A close examination of random samples can be one of the most effective means of making sense of something too complex to be comprehended directly.
[S]ampling offers one of the simplest, and also the best, ways of cutting through the difficulties.
Time and space are at the root of the most familiar tradeoffs in computer science, but recent work on randomized algorithms shows that there’s also another variable to consider: certainty.
In physics, what we call “temperature” is really velocity—random motion at the molecular scale.
The foundation of human connection is protocol—a shared convention of procedures and expectations, from handshakes and hellos to etiquette, politesse, and the full gamut of social norms. Machine connection is no different. Protocol is how we get on the same page;
But with the Internet, computers became not only the conduit but also the endpoints: the ones doing the talking. As such, they’ve needed to be responsible for solving their own communication issues.
What we now think of as “the Internet” is actually a collection of many protocols, but the chief among them (so much so that it’s often referred to more or less synonymously with the Internet) is what’s known as Transmission Control Protocol, or TCP.
In a packet-switched network, rather than using a dedicated channel for each connection, senders and receivers atomize their messages into tiny shards known as “packets,” and merge them into the communal flow of data—a bit like postcards moving at the speed of light.
In packet switching, on the other hand, the proliferation of paths in a growing network becomes a virtue: there are now that many more ways for data to flow, so the reliability of the network increases exponentially with its size.
In TCP, a failure generally leads to retransmission rather than death, so it’s considered enough for a session to begin with what’s called a “triple handshake.” The visitor says hello, the server acknowledges the hello and says hello back, the visitor acknowledges that, and if the server receives this third message, then no further confirmation is needed and they’re off to the races.
[O]nline, packet delivery is confirmed by what are called acknowledgment packets, or ACKs. These are critical to the functioning of the network. The way that ACKs work is both simple and clever. Behind the scenes of the triple handshake, each machine provides the other with a kind of serial number—and it’s understood that every packet sent after that will increment those serial numbers by one each time, like checks in a checkbook.
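The checkbook idea can be sketched in a few lines — a heavy simplification of TCP's cumulative acknowledgments, not the real protocol. The receiver always ACKs the next serial number it expects; a lost packet shows up as the same ACK repeated, which tells the sender exactly where to resume.

```python
def deliver(packets, lost):
    # Receiver ACKs the next sequence number it expects (cumulative ACKs).
    expected, acks = 0, []
    for seq in packets:
        if seq in lost:
            continue  # this packet vanished in transit
        if seq == expected:
            expected += 1
        acks.append(expected)  # "I have everything below this number"
    return acks

# Packet 2 is lost: the repeated ACK 2 tells the sender what to resend.
print(deliver(range(5), lost={2}))  # → [1, 2, 2, 2]
```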
The breakthrough turned out to be increasing the average delay after every successive failure—specifically, doubling the potential delay before trying to transmit again. So after an initial failure, a sender would randomly retransmit either one or two turns later; after a second failure, it would try again anywhere from one to four turns later; a third failure in a row would mean waiting somewhere between one and eight turns, and so on. This elegant approach allows the network to accommodate potentially any number of competing signals. Since the maximum delay length (2, 4, 8, 16…) forms an exponential progression, the approach has become known as Exponential Backoff.
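A sketch of the doubling-window scheme (the flaky channel and the cap are my own stand-ins): after the k-th straight failure, the sender waits a random one to 2ᵏ turns before trying again.

```python
import random

random.seed(2)

def backoff_delay(failures, cap=64):
    # After the k-th straight failure, wait a random 1..2^k turns (capped).
    return random.randint(1, min(2 ** failures, cap))

def send_with_backoff(attempt_send, max_failures=30):
    # Retry a flaky transmission, doubling the potential delay each failure.
    waited = 0
    for failures in range(1, max_failures + 1):
        if attempt_send():
            return waited  # success: total turns spent waiting
        waited += backoff_delay(failures)
    raise RuntimeError("gave up after repeated failures")

# Stand-in for a congested channel that gets through 30% of the time.
flaky = lambda: random.random() < 0.3
print("turns waited before success:", send_with_backoff(flaky))
```

The randomness matters as much as the doubling: two colliding senders that waited deterministic, identical delays would collide forever.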
Exponential Backoff is also a critical part of networking security, when successive password failures in logging into an account are punished by an exponentially increasing lockout period.
Packet switching is radically different. The phone system gets full; the mail system gets slow.
At the heart of TCP congestion control is an algorithm called Additive Increase, Multiplicative Decrease, or AIMD.
Under AIMD, any fully received batch of packets causes the number of packets in flight not to double but merely to increase by 1, and dropped packets cause the transmission rate to cut back by half
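The resulting "TCP sawtooth" falls out of a one-line update rule — a minimal sketch tracking the window size (packets in flight) over a made-up sequence of deliveries and drops:

```python
def aimd(events, window=1.0):
    # Additive Increase, Multiplicative Decrease:
    # a delivered batch ("ok") grows the window by 1; a drop halves it.
    history = []
    for event in events:
        window = window + 1 if event == "ok" else window / 2
        history.append(window)
    return history

print(aimd(["ok"] * 5 + ["drop"] + ["ok"] * 3 + ["drop"]))
# → [2.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0, 6.0, 3.0] — the sawtooth shape
```

The asymmetry is the point: probing gently upward for spare capacity, but retreating sharply the moment congestion appears.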
Fundamentally, buffers use delay—known in networking as “latency”—in order to maximize throughput. That is, they cause packets (or customers) to wait, to take advantage of later periods when things are slow. But a buffer that’s operating permanently full gives you the worst of both worlds: all the latency and none of the give.
Optimal stopping problems spring from the irreversibility and irrevocability of time; the explore/exploit dilemma, from time’s limited supply. Relaxation and randomization emerge as vital and necessary strategies for dealing with the ineluctable complexity of challenges like trip planning and vaccinations.
[T]he value of a stock isn’t what people think it’s worth but what people think people think it’s worth.
Game theory covers an incredibly broad spectrum of scenarios of cooperation and competition, but the field began with those resembling heads-up poker: two-person contests where one player’s gain is another player’s loss. Mathematicians analyzing these games seek to identify a so-called equilibrium: that is, a set of strategies that both players can follow such that neither player would want to change their own play, given the play of their opponent.
[T]he Nash equilibrium offers a prediction of the stable long-term outcome of any set of rules or incentives. As such, it provides an invaluable tool for both predicting and shaping economic policy, as well as social policy in general.
In a game-theory context, knowing that an equilibrium exists doesn’t actually tell us what it is—or how to get there.
[I]n games of real-world complexity it’s now clear we cannot take for granted that the participants will be able to discover or reach the game’s equilibrium. This, in turn, means that the game’s designers can’t necessarily use the equilibrium to predict how the players will behave.
Even when we can reach an equilibrium, just because it’s stable doesn’t make it good. It may seem paradoxical, but the equilibrium strategy—where neither player is willing to change tack—is by no means necessarily the strategy that leads to the best outcomes for the players.
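The prisoner's dilemma is the standard illustration (the payoff numbers below are the textbook convention, not from this passage). A tiny equilibrium-finder checks every pure-strategy pair for whether either player could gain by unilaterally switching — and the lone stable point turns out to be worse for both players than mutual cooperation.

```python
import itertools

# Prisoner's dilemma payoffs (higher is better for that player):
# (my_move, their_move) -> (my_payoff, their_payoff)
C, D = "cooperate", "defect"
payoff = {
    (C, C): (3, 3), (C, D): (0, 5),
    (D, C): (5, 0), (D, D): (1, 1),
}

def is_equilibrium(a, b):
    # Nash condition: neither player gains by unilaterally changing moves.
    best_a = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in (C, D))
    best_b = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in (C, D))
    return best_a and best_b

equilibria = [ab for ab in itertools.product((C, D), repeat=2)
              if is_equilibrium(*ab)]
print(equilibria)  # [('defect', 'defect')] — the only stable point
print(payoff[(D, D)], payoff[(C, C)])  # (1, 1) vs (3, 3): stable, yet worse
```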