Algorithms to Live By: The Computer Science of Human Decisions
Read from November 19, 2023 to January 11, 2024
1%
class of mathematical problems known as “optimal stopping” problems. The 37% rule defines a simple series of steps—what computer scientists call an “algorithm”—for solving these problems.
1%
In this book, we explore the idea of human algorithm design—searching for better solutions to the challenges people encounter every day.
1%
Optimal stopping tells us when to look and when to leap. The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. Sorting theory tells us how (and whether) to arrange our offices. Caching theory tells us how to fill our closets. Scheduling theory tells us how to fill our time.
1%
As Carl Sagan put it, “Science is a way of thinking much more than it is a body of knowledge.”
1%
tackling real-world tasks requires being comfortable with chance, trading off time with accuracy, and using approximations.
1%
Don’t always consider all your options. Don’t necessarily go for the outcome that seems best every time. Make a mess on occasion. Travel light. Let things wait. Trust your instincts and don’t think too long. Relax. Toss a coin. Forgive, but don’t forget. To thine own self be true.
2%
If you prefer Mr. Martin to every other person; if you think him the most agreeable man you have ever been in company with, why should you hesitate? —JANE AUSTEN, EMMA
2%
The nature of serial monogamy, writ large, is that its practitioners are confronted with a fundamental, unavoidable problem. When have you met enough people to know who your best match is? And what if acquiring the data costs you that very match?
2%
The 37% Rule* derives from optimal stopping’s most famous puzzle, which has come to be known as the “secretary problem.”
3%
the Look-Then-Leap Rule: You set a predetermined amount of time for “looking”—that is, exploring your options, gathering data—in which you categorically don’t choose anyone, no matter how impressive. After that point, you enter the “leap” phase, prepared to instantly commit to anyone who outshines the best applicant you saw in the look phase.
3%
As the applicant pool grows, the exact place to draw the line between looking and leaping settles to 37% of the pool, yielding the 37% Rule: look at the first 37% of the applicants,* choosing none, then be ready to leap for anyone better than all those you’ve seen so far.
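A minimal Python sketch of the rule as described in the two highlights above. The function names, pool size, and uniform random scores are my own illustrative choices, not the book's; run over many trials, the strategy lands on the single best applicant roughly 37% of the time.

```python
import math
import random

def look_then_leap(candidates, look_fraction=1 / math.e):
    """Reject the first ~37% of candidates outright, then commit to the
    first one who beats everyone seen during the look phase."""
    n = len(candidates)
    cutoff = int(n * look_fraction)
    best_seen = max(candidates[:cutoff], default=float("-inf"))
    for value in candidates[cutoff:]:
        if value > best_seen:
            return value
    return candidates[-1]  # nobody outshone the look phase: settle for the last

def success_rate(n=100, trials=100_000):
    """Estimate how often the rule picks the single best candidate."""
    wins = 0
    for _ in range(trials):
        pool = [random.random() for _ in range(n)]
        if look_then_leap(pool) == max(pool):
            wins += 1
    return wins / trials

print(f"Best candidate chosen in about {success_rate():.0%} of trials")
```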
3%
The passion between the sexes has appeared in every age to be so nearly the same that it may always be considered, in algebraic language, as a given quantity. —THOMAS MALTHUS
4%
the Threshold Rule, where we immediately accept an applicant if they are above a certain percentile.
4%
If you have all the facts, you can succeed more often than not, even as the applicant pool grows arbitrarily large.
4%
Gold digging is more likely to succeed than a quest for love.
4%
Any yardstick that provides full information on where an applicant stands relative to the population at large will change the solution from the Look-Then-Leap Rule to the Threshold Rule and will dramatically boost your chances of finding the single best applicant in the group.
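A sketch of the threshold idea, with one loudly flagged simplification: the code below maximizes the expected percentile of the applicant you end up with, a related but easier objective than the book's goal of maximizing the chance of landing the single best. Scores are assumed to be uniform percentiles, and the recursion and names are mine. What it shares with the Threshold Rule is the key structure: accept anyone above a cutoff, where the cutoff depends only on how many applicants remain.

```python
import random

def continuation_values(n):
    """v[k] = expected percentile you can still secure with k applicants to
    come, assuming scores are i.i.d. Uniform(0, 1). The forced final pick is
    worth 0.5 on average; each earlier stage keeps a score only if it beats
    the value of continuing, giving v[k] = E[max(U, v[k-1])]."""
    v = [0.0] * (n + 1)
    if n >= 1:
        v[1] = 0.5
    for k in range(2, n + 1):
        v[k] = (1 + v[k - 1] ** 2) / 2
    return v

def threshold_rule(scores):
    """Accept the first applicant whose percentile beats the expected value
    of whatever is still to come."""
    n = len(scores)
    v = continuation_values(n)
    for i, x in enumerate(scores):
        remaining = n - i - 1
        if remaining == 0 or x >= v[remaining]:
            return x

print(threshold_rule([random.random() for _ in range(20)]))
```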
6%
I expect to pass through this world but once. Any good therefore that I can do, or any kindness that I can show to any fellow creature, let me do it now. Let me not defer or neglect it, for I shall not pass this way again. —STEPHEN GRELLET
6%
Intuitively, we think that rational decision-making means exhaustively enumerating our options, weighing each one carefully, and then selecting the best. But in practice, when the clock—or the ticker—is ticking, few aspects of decision-making (or of thinking more generally) are as important as one: when to stop.
7%
exploration is gathering information, and exploitation is using the information you have to get a known good result.
7%
In computer science, the tension between exploration and exploitation takes its most concrete form in a scenario called the “multi-armed bandit problem.”
7%
the explore/exploit tradeoff isn’t just a way to improve decisions about where to eat or what to listen to. It also provides fundamental insights into how our goals should change as we age, and why the most rational course of action isn’t always trying to choose the best.
7%
When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.
7%
A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle.
7%
the value of exploitation can only go up over time.
8%
Robbins specifically considered the case where there are exactly two slot machines, and proposed a solution called the Win-Stay, Lose-Shift algorithm: choose an arm at random, and keep pulling it as long as it keeps paying off. If the arm doesn’t pay off after a particular pull, then switch to the other one.
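A minimal sketch of Win-Stay, Lose-Shift for two Bernoulli slot machines; the payout probabilities, pull count, and function name below are illustrative, not from the book.

```python
import random

def win_stay_lose_shift(pay_probs, pulls=1_000, rng=random):
    """Start on a random arm; keep pulling it after every win and switch
    to the other arm after every loss."""
    arm = rng.randrange(2)
    total = 0
    for _ in range(pulls):
        win = rng.random() < pay_probs[arm]
        total += win
        if not win:
            arm = 1 - arm  # lose, so shift to the other machine
    return total

# e.g. win_stay_lose_shift([0.6, 0.4]) tends to beat the ~500 wins expected
# from choosing an arm uniformly at random on every pull.
```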
8%
Robbins proved in 1952 that it performs reliably better than chance.
8%
Win-Stay, Lose-Shift doesn’t have any notion of the interval over which you are optimizing.
8%
Economists refer to this idea, of valuing the present more highly than the future, as “discounting.”
8%
there is some guaranteed payout rate which, if offered to us in lieu of that machine, will make us quite content never to pull its handle again. This number—which Gittins called the “dynamic allocation index,” and which the world now knows as the Gittins index—suggests an obvious strategy on the casino floor: always play the arm with the highest index.*
8%
once the Gittins index for a particular set of assumptions is known, it can be used for any problem of that form.
8%
Gittins index values as a function of wins and losses, assuming that a payoff next time is worth 90% of a payoff now.
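The book presents those values as a table rather than a computation. One way to approximate them numerically, sketched below under my own assumptions (a uniform prior over the machine's payoff rate, a 200-step truncation of the infinite discounted future, and invented function names), follows the passage's framing directly: for a given win/loss record, search for the guaranteed payout rate at which retiring to that rate becomes just as attractive as continuing to pull the handle.

```python
from functools import lru_cache

GAMMA = 0.9      # a payoff next pull is worth 90% of a payoff now
HORIZON = 200    # truncation depth; GAMMA**200 is negligible

def value_of_playing(wins, losses, guaranteed):
    """Expected discounted value of a machine with `wins` and `losses` so far
    when, before any pull, you may instead retire to a sure payout of
    `guaranteed` per pull forever."""
    retire = guaranteed / (1 - GAMMA)

    @lru_cache(maxsize=None)
    def v(w, l):
        if (w - wins) + (l - losses) >= HORIZON:
            return retire
        p = (w + 1) / (w + l + 2)  # posterior mean payoff under a uniform prior
        play = p * (1 + GAMMA * v(w + 1, l)) + (1 - p) * GAMMA * v(w, l + 1)
        return max(retire, play)

    return v(wins, losses)

def gittins_index(wins, losses, tol=1e-4):
    """Binary-search for the guaranteed rate at which playing on and retiring
    are equally attractive; that rate approximates the Gittins index."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if value_of_playing(wins, losses, mid) > mid / (1 - GAMMA) + 1e-9:
            lo = mid  # still worth playing, so the index is higher
        else:
            hi = mid
    return (lo + hi) / 2

# gittins_index(0, 0) comes out near 0.70: an untried machine is as attractive
# as one known to pay off about 70% of the time (see the highlight below).
```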
9%
something you have no experience with whatsoever is more attractive than a machine that you know pays out 70% of the time!
9%
The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring.
9%
Exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.
9%
focus on regret.
9%
In the memorable words of management theorist Chester Barnard, “To try and fail is at least to learn; to fail to try is to suffer the inestimable loss of what might have been.”
9%
a “regret minimization framework.”
9%
your total amount of regret will probably never stop increasing, even if you pick the best possible strategy—because even the best strategy isn’t perfect every time.
9%
regret will increase at a slower rate if you pick the best strategy than if you pick others; what’s more, with a good strategy regret’s rate of growth will go down over time, as you learn more about the problem and are able to make better choices.
9%
the minimum possible regret—again assuming non-omniscience—is regret that increases at a logarithmic rate...
This highlight has been truncated due to consecutive passage length restrictions.
9%
Logarithmically increasing regret means that we’ll make as many mistakes in our first ten pulls as in the following ninety, and as many in our first y...
This highlight has been truncated due to consecutive passage length restrictions.
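A quick arithmetic check of the truncated claim above, treating cumulative regret as proportional to log t; the constant is arbitrary, and only the growth rate matters.

```python
import math

def total_regret(t, c=1.0):
    """Cumulative regret after t pulls, if it grows like c * log(t)."""
    return c * math.log(t)

first_ten = total_regret(10)                        # mistakes over pulls 1-10
next_ninety = total_regret(100) - total_regret(10)  # mistakes over pulls 11-100
print(first_ten, next_ninety)  # equal, since log(100) - log(10) == log(10)
```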
9%
if we’re following a regret-minimizing algorithm, every year we can expect to have fewer new regrets than we did the year before.
9%
algorithms that offer the guarantee of minimal regret. Of the ones they’ve discovered, the most popular are known as Upper Confidence Bound algorithms.
9%
Visual displays of statistics often include so-called error bars that extend above and below any data point, indicating uncertainty in the measurement; the error bars show the range of plausible values that the quantity being measured could actually have. This range is known as the “confidence interval,” and as we gain more data about something the confidence interval will shrink, reflecting an increasingly accurate assessment.
9%
In a multi-armed bandit problem, an Upper Confidence Bound algorithm says, quite simply, to pick the option for which the top of the confidence interval is highest.
9%
Upper Confidence Bound algorithms assign a single number to each arm of the multi-armed bandit. And that number is set to the highest value that the arm could reasonably have, based on the information available so far. So an Upper Confidence Bound algorithm doesn’t care which arm has performed best so far; instead, it chooses the arm that could reasonably perform best in the future.
9%
the Upper Confidence Bound is always greater than the expected value, but by less and less as we gain more experience with a particular option.
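The highlights above do not pin down a formula, but a standard member of this family is the UCB1 rule: empirical mean plus an exploration bonus of sqrt(2 ln t / n), which shrinks as an arm accumulates pulls. The sketch below uses that variant; the payoff probabilities and names are illustrative.

```python
import math
import random

def ucb1(pay_probs, pulls=10_000, rng=random):
    """Always pull the arm whose upper confidence bound (empirical mean plus
    an exploration bonus that shrinks with experience) is highest."""
    n_arms = len(pay_probs)
    counts = [0] * n_arms     # how often each arm has been pulled
    rewards = [0.0] * n_arms  # total payoff collected from each arm
    total = 0.0
    for t in range(1, pulls + 1):
        if t <= n_arms:
            arm = t - 1       # pull every arm once to seed the estimates
        else:
            arm = max(range(n_arms),
                      key=lambda a: rewards[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        payoff = 1.0 if rng.random() < pay_probs[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += payoff
        total += payoff
    return total

# e.g. ucb1([0.2, 0.5, 0.7]) ends up concentrating its pulls on the 0.7 arm.
```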
9%
Upper Confidence Bound algorithms implement a principle that has been dubbed “optimism in the face of uncertainty.”
10%
they naturally inject a dose of exploration into the decision-making process, leaping at new options with enthusiasm because any one of them could be the next big thing.
10%
In the long run, optimism is the best prevention for regret.