Rationality: From AI to Zombies
28%
Because we all want to be seen as rational—and doubting is widely believed to be a virtue of a rationalist. But it is not widely understood that you need a particular reason to doubt, or that an unresolved doubt is a null-op. Instead people think it’s about modesty, a submissive demeanor, maintaining the tribal status hierarchy—almost exactly the same problem as with humility, on which I have previously written. Making a great public display of doubt to convince yourself that you are a rationalist will do around as much good as wearing a lab coat.
29%
To Bayesians, the brain is an engine of accuracy: it processes and concentrates entangled evidence into a map that reflects the territory. The principles of rationality are laws in the same sense as the Second Law of Thermodynamics: obtaining a reliable belief requires a calculable amount of entangled evidence, just as reliably cooling the contents of a refrigerator requires a calculable minimum of free energy.
29%
And yet skillful scientific specialists, even the major innovators of a field, even in this very day and age, do not apply that skepticism successfully. Nobel laureate Robert Aumann, of Aumann’s Agreement Theorem, is an Orthodox Jew: I feel reasonably confident in venturing that Aumann must, at one point or another, have questioned his faith. And yet he did not doubt successfully. We change our minds less often than we think.
29%
Even when it’s explicitly pointed out, some people seemingly cannot follow the leap from the object-level “Use Occam’s Razor! You have to see that your God is an unnecessary belief!” to the meta-level “Try to stop your mind from completing the pattern the usual way!” Because in the same way that all your rationalist friends talk about Occam’s Razor like it’s a good thing, and in the same way that Occam’s Razor leaps right up into your mind, so too, the obvious friend-approved religious response is “God’s ways are mysterious and it is presumptuous to suppose that we can understand them.” So for…
30%
Since some of these reproducible differences impact reproducibility—a phenomenon called “selection”—evolution has resulted in organisms suited to reproduction in environments like the ones their ancestors had. Everything about you is built on the echoes of your ancestors’ struggles and victories.
32%
So long as there are limited resources and multiple competing actors capable of passing on characteristics, you have selection pressure. —Perry
35%
You can view both intelligence and natural selection as special cases of optimization: processes that hit, in a large search space, very small targets defined by implicit preferences. Natural selection prefers more efficient replicators. Human intelligences have more complex preferences. Neither evolution nor humans have consistent utility functions, so viewing them as “optimization processes” is understood to be an approximation. You’re trying to get at the sort of work being done, not claim that humans or evolution do this work perfectly.
35%
So animal brains—up until recently—were not major players in the planetary game of optimization; they were pieces but not players. Compared to evolution, brains lacked both generality of optimization power (they could not produce the amazing range of artifacts produced by evolution) and cumulative optimization power (their products did not accumulate complexity over time). For more on this theme see Protein Reinforcement and DNA Consequentialism.
35%
The present state of the art in rationality training is not sufficient to turn an arbitrarily selected mortal into Albert Einstein, which shows the power of a few minor genetic quirks of brain design compared to all the self-help books ever written in the twentieth century.
35%
No matter how common-sensical, no matter how logical, no matter how “obvious” or “right” or “self-evident” or “intelligent” something seems to you, it will not happen inside the ghost. Unless it happens at the end of a chain of cause and effect that began with the instructions that you had to decide on, plus any causal dependencies on sensory data that you built into the starting instructions.
37%
It is now being suggested in several sources that an actual majority of published findings in medicine, though “statistically significant with p < 0.05,” are untrue. But so long as p < 0.05 remains the threshold for publication, why should anyone hold themselves to higher standards, when that requires bigger research grants for larger experimental groups, and decreases the likelihood of getting a publication? Everyone knows that the whole point of science is to publish lots of papers, just as the whole point of a university is to print certain pieces of parchment, and the whole point of a…
37%
From a Bayesian perspective, subgoals are epiphenomena of conditional probability functions. There is no expected utility without utility. How silly would it be to think that instrumental value could take on a mathematical life of its own, leaving terminal value in the dust? It’s not sane by decision-theoretical criteria of sanity.
38%
Do you care about threats to your civilization? The worst metathreat to complex civilization is its own complexity, for that complication leads to the loss of many purposes. I look back, and I see that more than anything, my life has been driven by an exceptionally strong abhorrence of lost purposes. I hope it can be transformed into a learnable skill.
38%
The Bayesian definition of evidence favoring a hypothesis is evidence which we are more likely to see if the hypothesis is true than if it is false. Observing that a syllogism is logically valid can never be evidence favoring any empirical proposition, because the syllogism will be logically valid whether that proposition is true or false.
39%
But that’s not the a priori irrational part: The a priori irrational part is where, in the course of the argument, someone pulls out a dictionary and looks up the definition of “atheism” or “religion.” (And yes, it’s just as silly whether an atheist or religionist does it.) How could a dictionary possibly decide whether an empirical cluster of atheists is really substantially different from an empirical cluster of theologians? How can reality vary with the meaning of a word? The points in thingspace don’t move around when we redraw a boundary.
39%
A neural network needs a learning rule. The obvious idea is that when two nodes are often active at the same time, we should strengthen the connection between them—this is one of the first rules ever proposed for training a neural network, known as Hebb’s Rule.
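Hebb’s Rule can be sketched in a few lines. This is a minimal illustration, not code from the book; the network size, activation pattern, and learning rate are all invented for the example:

```python
# Hebb's Rule, minimally: when two nodes are active at the same time,
# strengthen the connection between them.
def hebbian_update(weights, activations, learning_rate=0.1):
    n = len(activations)
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections
                weights[i][j] += learning_rate * activations[i] * activations[j]
    return weights

w = [[0.0] * 3 for _ in range(3)]
# Nodes 0 and 1 repeatedly fire together; node 2 stays silent.
for _ in range(5):
    w = hebbian_update(w, [1.0, 1.0, 0.0])

print(w[0][1])  # co-active pair strengthened: 0.5
print(w[0][2])  # never co-active: 0.0
```

Connections between co-active nodes grow; connections involving the silent node stay at zero, which is the whole content of the rule.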
39%
And lo, Network 1 exhibits this behavior even though there’s no explicit node that says whether the object is a blegg or not. The judgment is implicit in the whole network!! Bleggness is an attractor!! which arises as the result of emergent behavior!! from the distributed!! learning rule.
40%
A key idea of the heuristics and biases program is that mistakes are often more revealing of cognition than correct answers. Getting into a heated dispute about whether, if a tree falls in a deserted forest, it makes a sound, is traditionally considered a mistake.
40%
So if you find a blue egg-shaped object that contains palladium and you ask “Is it a blegg?,” the answer depends on what you have to do with the answer. If you ask “Which bin does the object go in?,” then you choose as if the object is a rube. But if you ask “If I turn off the light, will it glow?,” you predict as if the object is a blegg. In one case, the question “Is it a blegg?” stands in for the disguised query, “Which bin does it go in?” In the other case, the question “Is it a blegg?” stands in for the disguised query, “Will it glow in the dark?”
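The two disguised queries can be made concrete. The feature names and decision rules below are invented for illustration (consistent with the passage, where sorting follows the metal content and glow follows the surface features):

```python
# One object, two disguised queries, two different answers.
obj = {"color": "blue", "shape": "egg", "contains": "palladium"}

def which_bin(o):
    # Sorting is driven by the metal content: palladium goes to the rube bin.
    return "rube bin" if o["contains"] == "palladium" else "blegg bin"

def glows_in_dark(o):
    # Glow prediction is driven by surface features: blue eggs glow.
    return o["color"] == "blue" and o["shape"] == "egg"

print(which_bin(obj))      # sorted as a rube
print(glows_in_dark(obj))  # predicted to glow, like a blegg
```

Asking “Is it a blegg?” adds nothing here; each downstream use has its own answer once you query the features directly.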
40%
When you look at Network 2, you are seeing from the outside; but the way that neural network structure feels from the inside, if you yourself are a brain running that algorithm, is that even after you know every characteristic of the object, you still find yourself wondering: “But is it a blegg, or not?”
40%
People cling to their intuitions, I think, not so much because they believe their cognitive algorithms are perfectly reliable, but because they can’t see their intuitions as the way their cognitive algorithms happen to look from the inside.
40%
The argument starts shifting to focus on definitions. Whenever you feel tempted to say the words “by definition” in an argument that is not literally about pure mathematics, remember that anything which is true “by definition” is true in all possible worlds, and so observing its truth can never constrain which world you live in.
41%
There is an art to using words; even when definitions are not literally true or false, they are often wiser or more foolish. Dictionaries are mere histories of past usage; if you treat them as supreme arbiters of meaning, it binds you to the wisdom of the past, forbidding you to do better.
41%
Here the illusion of inference comes from the labels, which conceal the premises, and pretend to novelty in the conclusion. Replacing labels with definitions reveals the illusion, making visible the tautology’s empirical unhelpfulness. You can never say that Socrates is a [mortal, ¬feathers, biped] until you have observed him to be mortal.
41%
The general skill of blanking a word out of my mind was one I’d practiced for years, albeit with a different purpose.
41%
When you find yourself in philosophical difficulties, the first line of defense is not to define your problematic terms, but to see whether you can think without using those terms at all.
41%
Playing the game of Taboo—being able to describe without using the standard pointer/label/handle—is one of the fundamental rationalist capacities. It occupies the same primordial level as the habit of constantly asking “Why?” or “What does this belief make me anticipate?”
41%
The art is closely related to: Pragmatism, because seeing in this way often gives you a much closer connection to anticipated experience, rather than propositional belief; Reductionism, because seeing in this way often forces you to drop down to a lower level of organization, look at the parts instead of your eye skipping over the whole; Hugging the query, because words often distract you from the question you really want to ask; Avoiding cached thoughts, which will rush in using standard words, so you can block them by tabooing standard words; The writer’s rule of “Show, don’t tell!,” which…
41%
Reality is very large—just the part we can see is billions of lightyears across. But your map of reality is written on a few pounds of neurons, folded up to fit inside your skull. I don’t mean to be insulting, but your skull is tiny. Comparatively speaking.
41%
To realize that there are two distinct events, underlying one point on your map, is an essentially scientific challenge—a big, difficult scientific challenge.
41%
Sometimes fallacies of compression result from confusing two known things under the same label—you know about acoustic vibrations, and you know about auditory processing in brains, but you call them both “sound” and so confuse yourself. But the more dangerous fallacy of compression arises from having no idea whatsoever that two distinct entities even exist.
41%
Here it is the very act of creating two different buckets that is the stroke of genius insight. ’Tis easier to question one’s facts than one’s ontology.
41%
The obvious modern-day illustration would be words like “intelligence” or “consciousness.” Every now and then one sees a press release claiming that a research study has “explained consciousness” because a team of neurologists investigated a 40Hz electrical rhythm that might have something to do with cross-modality binding of sensory information, or because they investigated the reticular activating system that keeps humans awake. That’s an extreme example, and the usual failures are more subtle, but they are of the same kind.
41%
The part of “consciousness” that people find most interesting is reflectivity, self-awareness, realizing that the person I see in the mirror is “me”; that and the hard problem of subjective experience as distinguished by David Chalmers. We also label “conscious” the state of being awake, rather than asleep, in our daily cycle. But they are all different concepts going under the same name, and the underlying phenomena are different scientific puzzles. You can explain being awake without explaining reflectivity or subjectivity.
42%
We could say that eluctromugnetism is a wrong word, a boundary in thingspace that loops around and swerves through the clusters, a cut that fails to carve reality along its natural joints.
42%
Then the entropy of Y would be 1.75 bits, meaning that we can find out its value by asking 1.75 yes-or-no questions.
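The 1.75-bit figure is what you get from, for example, a four-valued variable with probabilities 1/2, 1/4, 1/8, 1/8 (an assumed distribution for Y, chosen here because it reproduces the number in the highlight):

```python
import math

# Shannon entropy in bits: H = -sum(p * log2(p)).
def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.25, 0.125, 0.125]))  # 1.75
```

With the right question order (“Is it the 1/2-probability state?” first), the four outcomes take 1, 2, 3, and 3 yes-or-no questions respectively, for an expected 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 questions.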
42%
The key to creating a good code—a code that transmits messages as compactly as possible—is to reserve short words for things that you’ll need to say frequently, and use longer words for things that you won’t need to say as often.
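Huffman coding is the textbook construction of such a code. This sketch, with invented symbol frequencies, builds one greedily by repeatedly merging the two rarest symbols:

```python
import heapq
from itertools import count

# Build a Huffman code: frequent symbols get short codewords, rare ones long.
def huffman_code(freqs):
    tiebreak = count()  # keeps heap comparisons well-defined for equal freqs
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"sunny": 0.5, "cloudy": 0.25, "rainy": 0.125, "snowy": 0.125})
print({s: len(w) for s, w in code.items()})
# The most frequent symbol gets a 1-bit codeword; the rarest get 3 bits.
```

For these frequencies the expected codeword length is 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits, which is as compact as any uniquely decodable code can be for this distribution.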
43%
One may even consider the act of defining a word as a promise to this effect. Telling someone, “I define the word ‘wiggin’ to mean a person with green eyes and black hair,” by Gricean implication, asserts that the word “wiggin” will somehow help you make inferences / shorten your messages.
43%
And the way to carve reality at its joints is to draw your boundaries around concentrations of unusually high probability density in Thingspace.
43%
The way to carve reality at its joints, is to draw simple boundaries around concentrations of unusually high probability density in Thingspace.
45%
Why does a mathematical concept generate this strange enthusiasm in its students? What is the so-called Bayesian Revolution now sweeping through the sciences, which claims to subsume even the experimental method itself as a special case? What is the secret that the adherents of Bayes know? What is the light that they have seen?
45%
The original proportion of patients with breast cancer is known as the prior probability.
45%
The chance that a patient with breast cancer gets a positive mammography, and the chance that a patient without breast cancer gets a positive mammography, are known as the two conditional probabilities.
45%
Collectively, this initial information is kno...
45%
The final answer—the estimated probability that a patient has breast cancer, given that we know she has a positive result on her mammography—is known as the revised probability or the posterior probability. What we’ve just seen is that the...
45%
What this demonstrates is that the mammography result doesn’t replace your old information about the patient’s chance of having cancer; the mammography slides the estimated probability in the direction of the result. A positive result slides the original probability upward; a negative result slides the probability downward. For example, in the original problem where 1% of the women have cancer, 80% of women with cancer get positive mammographies, and 9.6% of women without cancer get positive mammographies, a positive result on the mammography slides the 1% chance upward to 7.8%.
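The sliding-probability arithmetic above can be spelled out with Bayes’s Theorem, using exactly the numbers given in the passage:

```python
# Posterior probability of cancer given a positive mammography.
def posterior(prior, p_pos_given_cancer, p_pos_given_healthy):
    true_pos = prior * p_pos_given_cancer          # P(cancer) * P(+|cancer)
    false_pos = (1 - prior) * p_pos_given_healthy  # P(healthy) * P(+|healthy)
    return true_pos / (true_pos + false_pos)

p = posterior(0.01, 0.80, 0.096)
print(round(p, 3))  # 0.078: a positive result slides the 1% prior up to 7.8%
```

The false positives (9.6% of the 99% who are healthy) vastly outnumber the true positives (80% of the 1% with cancer), which is why the posterior lands at 7.8% rather than anywhere near 80%.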
46%
There are also a number of general heuristics about human reasoning that you can learn from looking at Bayes’s Theorem.
46%
A related error is to pay too much attention to P(X|A) and not enough to P(X|¬A) when determining how much evidence X is for A. The degree to which a result X is evidence for A depends not only on the strength of the statement we’d expect to see result X if A were true, but also on the strength of the statement we wouldn’t expect to see result X if A weren’t true.
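The point is that the strength of evidence lives in the ratio P(X|A)/P(X|¬A), not in P(X|A) alone. A small sketch (the 50% prior and the particular likelihoods are illustrative assumptions):

```python
# Bayesian update: same P(X|A), different P(X|not-A), very different posteriors.
def update(prior, p_x_given_a, p_x_given_not_a):
    num = prior * p_x_given_a
    return num / (num + (1 - prior) * p_x_given_not_a)

print(update(0.5, 0.8, 0.8))  # likelihood ratio 1: X is no evidence, stays 0.5
print(update(0.5, 0.8, 0.1))  # likelihood ratio 8: strong evidence, about 0.89
```

When X is just as likely whether or not A holds, observing X moves you nowhere, no matter how confidently A “predicted” it.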
46%
The Bayesian revolution in the sciences is fueled, not only by more and more cognitive scientists suddenly noticing that mental phenomena have Bayesian structure in them; not only by scientists in every field learning to judge their statistical methods by comparison with the Bayesian method; but also by the idea that science itself is a special case of Bayes’s Theorem; experimental evidence is Bayesian evidence.
46%
The Bayesian revolutionaries hold that when you perform an experiment and get evidence that “confirms” or “disconfirms” your theory, this confirmation and disconfirmation is governed by the Bayesian rules. For example, you have to take into account not only whether your theory predicts the phenomenon, but whether other possible explanations also predict the phenomenon.