Kindle Notes & Highlights
Read between November 19, 2023 and January 11, 2024
The economist Harry Markowitz won the 1990 Nobel Prize in Economics for developing modern portfolio theory: his groundbreaking “mean-variance portfolio optimization” showed how an investor could make an optimal allocation among various funds and assets to maximize returns at a given level of risk.
When it comes to portfolio management, it turns out that unless you’re highly confident in the information you have about the markets, you may actually be better off ignoring that information altogether.
psychologists Gerd Gigerenzer and Henry Brighton have argued that the decision-making shortcuts people use in the real world are in many cases exactly the kind of thinking that makes for good decisions.
“In contrast to the widely held view that less processing reduces accuracy,” they write, “the study of heuristics shows that less information, computation, and time can in fact improve accuracy.”
As a species, being constrained by the past makes us less perfectly adjusted to the present we know but helps keep us robust for the future we don’t.
Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting.
The effectiveness of regularization in all kinds of machine-learning tasks suggests that we can make better decisions by deliberately thinking and doing less.
The greater the uncertainty, the bigger the gap between what you can measure and what matters, the more you should watch out for overfitting—that is, the more you should prefer simplicity, and the earlier you should stop.
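One common regularizer that "prefers simplicity" this way is the L1 (Lasso) penalty. A minimal sketch, with made-up weights, of its soft-threshold step: every coefficient is shrunk toward zero, and the weak ones drop out entirely.

```python
def soft_threshold(w: float, lam: float) -> float:
    """Proximal step for an L1 penalty: shrink w toward zero by lam,
    and clip anything that crosses zero to exactly zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# Made-up coefficients; the penalty zeroes out the small ones.
weights = [2.5, 0.3, -1.2, 0.05]
regularized = [soft_threshold(w, lam=0.5) for w in weights]
print(regularized)  # the 0.3 and 0.05 terms vanish entirely
```

The surviving coefficients are also pulled toward zero, which is the "doing less" the highlight describes: a simpler model that deliberately ignores weak signals.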
“it is intolerable to think of spending one’s whole life like a neuter bee, working, working, & nothing after all.”
Darwin made up his mind exactly when his notes reached the bottom of the diary sheet. He was regularizing to the page. This is reminiscent of both Early Stopping and the Lasso: anything that doesn’t make the page doesn’t make the decision.
no one understands as well as a computer scientist that in the face of a seemingly unmanageable challenge, you should neither toil forever nor give up, but—as we’ll see—try a third thing entirely.
what’s known to mathematicians and computer scientists as a “constrained optimization” problem: how to find the single best arrangement of a set of variables, given particular rules and a scorekeeping measure.
the Cobham–Edmonds thesis: an algorithm should be considered “efficient” if it runs in what’s called “polynomial time”—that is, O(n²), O(n³), or in fact n to the power of any number at all. A problem, in turn, is considered “tractable” if we know how to solve it using an efficient algorithm. A problem we don’t know how to solve in polynomial time, on the other hand, is considered “intractable.” And at anything but the smallest scales, intractable problems are beyond the reach of solution by computers, no matter how powerful.*
One of the simplest forms of relaxation in computer science is known as Constraint Relaxation. In this technique, researchers remove some of the problem’s constraints and set about solving the problem they wish they had. Then, after they’ve made a certain amount of headway, they try to add the constraints back in.
The idea behind Lagrangian Relaxation is simple. An optimization problem has two parts: the rules and the scorekeeping. In Lagrangian Relaxation, we take some of the problem’s constraints and bake them into the scoring system instead. That is, we take the impossible and downgrade it to costly.
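A toy sketch of this move, with made-up items and an arbitrary penalty price: a hard capacity rule is folded into the score, so exceeding it is no longer impossible, merely costly, and the scoring itself steers the search back toward feasible answers.

```python
from itertools import combinations

# Made-up (name, value, weight) items for illustration.
items = [("a", 6, 5), ("b", 5, 5), ("c", 4, 3)]
CAPACITY = 10   # the hard rule we are relaxing: total weight <= 10
LAMBDA = 2.0    # penalty per unit of violation (the Lagrange multiplier)

def relaxed_score(chosen) -> float:
    """Value of the selection, minus a penalty for breaking the
    capacity rule: the impossible downgraded to the costly."""
    value = sum(v for _, v, _ in chosen)
    weight = sum(w for _, _, w in chosen)
    overage = max(0, weight - CAPACITY)
    return value - LAMBDA * overage

best = max(
    (combo for r in range(len(items) + 1) for combo in combinations(items, r)),
    key=relaxed_score,
)
print([name for name, _, _ in best])  # -> ['a', 'b']
```

Taking all three items scores 15 but pays a penalty of 6 for three units of overage; with this penalty price, the unconstrained maximization lands on a selection that happens to respect the original rule.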
There are many ways to relax a problem, and we’ve seen three of the most important. The first, Constraint Relaxation, simply removes some constraints altogether and makes progress on a looser form of the problem before coming back to reality. The second, Continuous Relaxation, turns discrete or binary choices into continua: when deciding between iced tea and lemonade, first imagine a 50–50 “Arnold Palmer” blend and then round it up or down. The third, Lagrangian Relaxation, turns impossibilities into mere penalties, teaching the art of bending the rules (or breaking them and accepting the consequences).
even when you don’t commit the infraction, simply imagining it can be illuminating.
Unless we’re willing to spend eons striving for perfection every time we encounter a hitch, hard problems demand that instead of spinning our tires we imagine easier versions and tackle those first. When applied correctly, this is not just wishful thinking, not fantasy or idle daydreaming. It’s one of our best ways of making progress.
I must admit that after many years of work in this area, the efficacy of randomness for so many algorithmic problems is absolutely mysterious to me. It is efficient, it works; but why and how is absolutely mysterious. —MICHAEL RABIN
a profound general truth: when we want to know something about a complex quantity, we can estimate its value by sampling from it.
This is intellectually surprising, and if not exactly humiliating, it gives one a feeling of modesty about the limits of rational or traditional thinking. In a sufficiently complicated problem, actual sampling is better than an examination of all the chains of possibilities.
sampling can succeed where analysis fails—
Metropolis named this approach—replacing exhaustive probability calculations with sample simulations—the Monte Carlo Method,
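The Monte Carlo Method in miniature, using the standard textbook example (not from the book) of estimating π: sample random points in a square and count how many land inside the quarter circle, in place of any exact geometric analysis.

```python
import random

def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling: the fraction of random points in the
    unit square that fall inside the quarter circle approaches pi/4."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4 * hits / samples

print(estimate_pi(100_000))  # close to 3.14159
```

More samples buy more accuracy, which is the method's signature trade: certainty is exchanged for tractability, one draw at a time.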
the most surprising realization about the power of randomness is that it can be used in situations where chance seemingly plays no role at all.
virtually all secure communication online—be it commerce, banking, or email—begins with a hunt for prime numbers.
The Miller-Rabin primality test, as it’s now known, provides a way to quickly identify even gigantic prime numbers with an arbitrary degree of certainty.
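A sketch of the Miller-Rabin test, following its standard published form: each random "witness" that fails to expose n as composite multiplies our confidence, so the certainty is arbitrary but never absolute. (The small-prime shortcut and round count here are conventional implementation choices, not part of the book's text.)

```python
import random

def is_probably_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin primality test.  A 'False' is certain; a 'True' is
    wrong with probability at most 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):  # quick screen with small primes
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)  # modular exponentiation keeps numbers small
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True  # survived every witness: almost certainly prime

print(is_probably_prime(2**61 - 1))  # a known Mersenne prime -> True
```

Twenty rounds leave an error probability below one in a trillion trillion, which is exactly Mill's "assurance sufficient for the purposes of human life."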
At once it struck me what quality went to form a Man of Achievement, especially in Literature, and which Shakespeare possessed so enormously—I mean Negative Capability, that is, when a man is capable of being in uncertainties, mysteries, doubts, without any irritable reaching after fact and reason. —JOHN KEATS
There is no such thing as absolute certainty, but there is assurance sufficient for the purposes of human life. —JOHN STUART MILL
a so-called greedy algorithm, which you can also think of as a “myopic algorithm”: one that shortsightedly takes the best thing available every step of the way.
One approach is to augment Hill Climbing with what’s known as “jitter”: if it looks like you’re stuck, mix things up a little.
Another approach is to completely scramble our solution when we reach a local maximum, and start Hill Climbing anew from this random new starting point. This algorithm is known, appropriately enough, as “Random-Restart Hill Climbing”—or, more colorfully, as “Shotgun Hill Climbing.”
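Both flavors of Hill Climbing can be sketched on a made-up one-dimensional landscape; the function, step size, and restart count below are arbitrary choices for illustration.

```python
import math
import random

def bumpy(x: float) -> float:
    # A made-up landscape full of local maxima; the global peak is at x = 0.
    return math.cos(3 * x) - 0.1 * x * x

def hill_climb(x: float, step: float = 0.05, tries: int = 500) -> float:
    """Myopic search: accept a small random tweak only if it scores better."""
    for _ in range(tries):
        candidate = x + random.uniform(-step, step)
        if bumpy(candidate) > bumpy(x):
            x = candidate
    return x

def shotgun_hill_climb(restarts: int = 50) -> float:
    """Random-Restart ('Shotgun') Hill Climbing: scatter many random
    starting points and keep the best peak any of them reaches."""
    starts = (random.uniform(-5, 5) for _ in range(restarts))
    return max((hill_climb(s) for s in starts), key=bumpy)

best = shotgun_hill_climb()
print(round(best, 2))  # almost always near the global peak at 0
```

A single climb from an unlucky start gets stuck on a lesser summit; the shotgun of restarts makes it overwhelmingly likely that at least one start lands in the basin of the true peak.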
instead of turning to full-bore randomness when you’re stuck, use a little bit of randomness every time you make a decision. This technique, developed by the same Los Alamos team that came up with the Monte Carlo Method, is called the Metropolis Algorithm.
The Metropolis Algorithm is like Hill Climbing, trying out different small-scale tweaks on a solution, but with one important difference: at any given point, it will potentially accept bad tweaks as well as good ones.
In physics, what we call “temperature” is really velocity—random motion at the molecular scale.
simulated annealing remains one of the most promising approaches to optimization problems known to the field.
in 1754, Horace Walpole coined the term “serendipity,” based on the fairy tale adventures of The Three Princes of Serendip (Serendip being the archaic name of Sri Lanka), who “were always making discoveries, by accidents and sagacity, of things they were not in quest of.”
Being randomly jittered, thrown out of the frame and focused on a larger scale, provides a way to leave what might be locally good and get back to the pursuit of what might be globally optimal.
First, from Hill Climbing: even if you’re in the habit of sometimes acting on bad ideas, you should always act on good ones. Second, from the Metropolis Algorithm: your likelihood of following a bad idea should be inversely proportional to how bad an idea it is. Third, from Simulated Annealing: you should front-load randomness, rapidly cooling out of a totally random state, using ever less and less randomness as time goes on, lingering longest as you approach freezing. Temper yourself—literally.
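Those three lessons compose into a short simulated-annealing sketch; the schedule constants and the toy score function below are arbitrary illustrations, not anything prescribed by the book.

```python
import math
import random

def anneal(score, start, neighbor, t_start=10.0, t_end=0.01, cooling=0.99):
    """Simulated annealing: Metropolis acceptance under a temperature
    that slowly cools, front-loading the randomness."""
    x, t = start, t_start
    while t > t_end:
        candidate = neighbor(x)
        delta = score(candidate) - score(x)
        # Always act on good ideas; act on bad ones with a probability
        # that shrinks both with how bad they are and with the temperature.
        if delta >= 0 or random.random() < math.exp(delta / t):
            x = candidate
        t *= cooling  # ever less randomness as time goes on
    return x

# Toy use: climb a made-up one-dimensional score whose peak is at x = 3.
best = anneal(lambda x: -(x - 3) ** 2,
              start=0.0,
              neighbor=lambda x: x + random.uniform(-1, 1))
print(round(best, 1))  # typically lands near 3
```

Early on, high temperature makes almost any move acceptable; by the end, the acceptance rule has hardened into plain Hill Climbing, which is the "tempering" the highlight describes.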
The world’s most difficult word to translate has been identified as “ilunga,” from the Tshiluba language spoken in south-eastern DR Congo.… Ilunga means “a person who is ready to forgive any abuse for the first time, to tolerate it a second time, but never a third time.” —BBC NEWS
Exponential Backoff.
Exponential Backoff isn’t a magic panacea in cases like this, but it does offer a possible way forward.
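A sketch of Exponential Backoff with randomized delays; adding "jitter" under the doubling ceiling is a common companion practice, though not part of the highlight, and the base, cap, and attempt count here are made up.

```python
import random

def backoff_delays(attempts: int, base: float = 1.0,
                   cap: float = 60.0, seed=None) -> list:
    """After each failure, double the maximum wait and pick a random
    delay beneath it, capped so the waits don't grow without bound."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 1, 2, 4, 8, ... up to cap
        delays.append(rng.uniform(0, ceiling))
    return delays

print(backoff_delays(6, seed=0))
```

The doubling ceiling embodies ilunga's escalating patience: each repeated failure earns a sharply longer wait before the next try, and the randomness keeps many retriers from colliding in lockstep.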
the heart of TCP congestion control is an algorithm called Additive Increase, Multiplicative Decrease, or AIMD.
Under AIMD, any fully received batch of packets causes the number of packets in flight not to double but merely to increase by 1, and dropped packets cause the transmission rate to cut back by half.
characteristic bandwidth shape known as the “TCP sawtooth”—steady upward climbs punctuated by steep drops.
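AIMD can be simulated in a few lines; the fixed capacity below is a made-up stand-in for the network's real, unknowable limit, but the sawtooth emerges all the same.

```python
def aimd(rounds: int, capacity: int = 20) -> list:
    """Additive Increase, Multiplicative Decrease: grow the window by 1
    each successful round; halve it whenever the (made-up) capacity is
    exceeded and a packet is dropped."""
    window, history = 1, []
    for _ in range(rounds):
        if window > capacity:
            window = max(1, window // 2)  # drop detected: back off sharply
        else:
            window += 1                   # success: probe gently upward
        history.append(window)
    return history

print(aimd(40))  # steady climbs punctuated by steep drops
```

Plotting the returned history gives the sawtooth: the sender keeps pushing until the network says no, then concedes half and begins climbing again.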
it may be that human communications themselves mirror the very protocols that transmit them: every text message or email reply encourages yet another, while every unreturned message stanches the flow.
The lesson of the TCP sawtooth is that in an unpredictable and changing environment, pushing things to the point of failure is indeed sometimes the best (or the only) way to use all the resources to their fullest. What matters is making sure that the response to failure is both sharp and resilient.
In one illustrative study, a team led by Janet Bavelas at the University of Victoria investigated what would happen when someone listening to a personal story got distracted: not what would happen to the listener’s comprehension, but what would happen to the story. With poor feedback, they discovered, the story falls apart.
Jackson Tolins and Jean Fox Tree demonstrated that those inconspicuous “uh-huhs” and “yeahs” and “hmms” and “ohs” that pepper our speech perform distinct, precise roles in regulating the flow of information from speaker to listener—both its rate and level of detail.
Tail Drop: an unceremonious way of saying that every packet arriving after that point is simply rejected, and effectively deleted.
Now is better than never. Although never is often better than right now. —THE ZEN OF PYTHON