Kindle Notes & Highlights
Read between March 28 - April 18, 2025
The therapist tells them to find the right, comfortable balance between impulsivity and overthinking. The algorithm tells them the balance is thirty-seven percent.
But an algorithm is just a finite sequence of steps used to solve a problem, and algorithms are much broader—and older by far—than the computer.
The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites.
“Science is a way of thinking much more than it is a body of knowledge.”
Instead, tackling real-world tasks requires being comfortable with chance, trading off time with accuracy, and using approximations.
Look-Then-Leap Rule: You set a predetermined amount of time for “looking”—that is, exploring your options, gathering data—in which you categorically don’t choose anyone, no matter how impressive. After that point, you enter the “leap” phase, prepared to instantly commit to anyone who outshines the best applicant you saw in the look phase.
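A minimal sketch of the Look-Then-Leap Rule as a simulation, to check the thirty-seven percent figure from the highlights. It assumes the classic setup: applicants arrive in random order, each can be ranked only against those already seen, and a rejection is final. The function names, the `look_fraction` parameter, and the use of 1/e as the default cutoff are illustrative choices, not something spelled out in the highlighted passages.

```python
import math
import random


def look_then_leap(scores, look_fraction=1 / math.e):
    """Return the index of the applicant chosen by the look-then-leap rule."""
    n = len(scores)
    look = int(n * look_fraction)  # size of the "look" phase
    # Best applicant seen while only looking (nobody is chosen here).
    best_seen = max(scores[:look], default=float("-inf"))
    # Leap phase: commit to the first applicant who beats that benchmark.
    for i in range(look, n):
        if scores[i] > best_seen:
            return i
    # Nobody beat the benchmark, so we are stuck with the last applicant.
    return n - 1


def success_rate(n, trials=20000, seed=0):
    """Estimate how often the rule picks the single best of n applicants."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))  # n - 1 is the best applicant
        rng.shuffle(scores)
        if scores[look_then_leap(scores)] == n - 1:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(success_rate(n), 3))
```

Under these assumptions the estimated success rate should stay in the neighborhood of 37% as the pool grows, rather than shrinking toward zero as a random pick would.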
Thus the bigger the applicant pool gets, the more valuable knowing the optimal algorithm becomes. It’s true that you’re unlikely to find the needle the majority of the time, but optimal stopping is your best defense against the haystack, no matter how large.
in the face of slim pickings, lower your standards. It also makes clear the converse: with more fish in the sea, raise them. In both cases, crucially, the math tells you exactly by how much.
the explicit premise of the optimal stopping problem is the implicit premise of what it is to be alive.
Hesitation—inaction—is just as irrevocable as action.
“Eat, drink, and be merry, for tomorrow we die,” but perhaps we should also have its inverse: “Start learning a new language or an instrument, and make small talk with a stranger, because life is long, and who knows what joy could blossom over many years’ time.”
When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.
So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.
The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring. The old adage tells us that “the grass is always greener on the other side of the fence,” but the math tells us why: the unknown has a chance of being better, even if we actually expect it to be no different, or if it’s just as likely to be worse.
“To try and fail is at least to learn; to fail to try is to suffer the inestimable loss of what might have been.”
of minimal regret. Of the ones they’ve discovered, the most popular are known as Upper Confidence Bound algorithms.
The success of Upper Confidence Bound algorithms offers a formal justification for the benefit of the doubt. Following the advice of these algorithms, you should be excited to meet new people and try new things—to assume the best about them, in the absence of evidence to the contrary. In the long run, optimism is the best prevention for regret.
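A rough sketch of what "optimism in the face of uncertainty" can look like in code. This uses UCB1, one well-known member of the Upper Confidence Bound family; the highlights do not say which variant is meant, so treat the choice as an assumption. The payoff probabilities in `arm_probs` and the function name `ucb1` are made up for illustration.

```python
import math
import random


def ucb1(arm_probs, rounds, seed=0):
    """Play a Bernoulli multi-armed bandit with UCB1; return pulls per arm."""
    rng = random.Random(seed)
    k = len(arm_probs)
    pulls = [0] * k     # how many times each arm was played
    totals = [0.0] * k  # total reward collected from each arm

    for t in range(1, rounds + 1):
        if t <= k:
            # Try every arm once so each has an average to work with.
            arm = t - 1
        else:
            # Score = observed average + a bonus that grows with uncertainty.
            # Rarely tried arms get a large bonus: the benefit of the doubt.
            arm = max(
                range(k),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


if __name__ == "__main__":
    # Hypothetical machine: the second arm pays off far more often.
    print(ucb1([0.2, 0.8], rounds=2000))
```

The bonus term shrinks as an arm is tried more often, so the weaker arm is still sampled occasionally but the bulk of the pulls should drift toward the better one.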
In general, it seems that people tend to over-explore—to favor the new disproportionately over the best.
the probabilities of a payoff on the different arms change over time—what has been termed a “restless bandit”—the problem becomes much harder.
To live in a restless world requires a certain restlessness in oneself. So long as things continue to change, you must never fully cease exploring.