Karam Elabd’s Kindle Notes & Highlights for Human Compatible: Artificial Intelligence and the Problem of Control

Human Compatible: Artificial Intelligence and the Problem of Control

by Stuart Russell

Read between December 4 - December 23, 2022

Like any rational entity, the algorithm learns how to modify the state of its environment—in this case, the user’s mind—in order to maximize its own reward.

The Baldwin effect, as it is now known, can be understood by imagining that evolution has a choice between creating an instinctive organism whose every response is fixed in advance and creating an adaptive organism that learns what actions to take.

Aristotle, among others, studied the notion of successful reasoning—methods of logical deduction that would lead to true conclusions given true premises. He also studied the process of deciding how to act—sometimes called practical reasoning

my plan involves a trade-off between the certainty of success and the cost of ensuring that degree of certainty.

In short, a rational agent acts so as to maximize expected utility.
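A minimal sketch of what "maximize expected utility" means in practice, tied to the trade-off between certainty of success and its cost noted above. The action names, probabilities, and utility values are hypothetical numbers chosen for illustration; they do not come from the book.

```python
def expected_utility(lottery):
    """Return the probability-weighted average utility.

    lottery: list of (probability, utility) pairs whose probabilities sum to 1.
    """
    return sum(p * u for p, u in lottery)


def best_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical plans: the safer plan succeeds more often, but success is
# worth slightly less because of the extra cost of ensuring it.
actions = {
    "safe_plan": [(0.99, 90.0), (0.01, -100.0)],
    "risky_plan": [(0.80, 100.0), (0.20, -100.0)],
}

for name, lottery in actions.items():
    print(name, round(expected_utility(lottery), 2))
print("chosen:", best_action(actions))
```

With these assumed numbers the safer plan wins; lowering the utility of its success far enough would flip the choice.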

So, while it would be quite unreasonable to base a theory of beneficial AI on an assumption that humans are rational, it’s quite reasonable to suppose that an adult human has roughly consistent preferences over future lives. That is, if you were somehow able to watch two movies, each describing in sufficient detail and breadth a future life you might lead, such that each constitutes a virtual experience, you could say which you prefer, or express indifference.

10%

Beyond 2025, we will need to use more exotic physical phenomena—including negative capacitance devices, single-atom transistors, graphene nanotubes, and photonics—to keep Moore’s law (or its successor) going.

13%

In the classical period of AI research, before uncertainty became a primary issue in the 1980s, most AI research assumed a world that was fully observable and deterministic, and goals made sense as a way to specify objectives.

14%

There are two kinds of logic that really matter in computer science. The first, called propositional or Boolean logic, was known to the Greeks as well as to ancient Chinese and Indian philosophers. It is the same language of AND gates, NOT gates, and so on that makes up the circuitry of computer chips. In a very literal sense, a modern CPU is just a very large mathematical expression—hundreds of millions of pages—written in the language of propositional logic. The second kind of logic, and the one that McCarthy proposed to use for AI, is called first-order logic. The language of first-order ...

15%

It was not until the 1980s, however, that a practical formal language and reasoning algorithms were developed for probabilistic knowledge. This was the language of Bayesian networks, introduced by Judea Pearl.

15%

prior probability—the initial degree of belief one has in a set of possible hypotheses—becomes a posterior probability as a result of observing some evidence. As more new evidence arrives, the posterior becomes the new prior and the process of Bayesian updating repeats ad infinitum. This process is so fundamental that the modern idea of rationality as maximization of expected utility is sometimes called Bayesian rationality.

18%
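A minimal sketch of the Bayesian updating loop described in the highlight above, in which each posterior becomes the prior for the next piece of evidence. The two coin hypotheses and their likelihoods are hypothetical and serve only as an illustration.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' rule once.

    prior: dict mapping hypothesis -> prior probability.
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis).
    Returns the posterior as a dict with the same keys.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: v / total for h, v in unnormalized.items()}


# Hypothetical setup: is the coin fair, or biased toward heads?
belief = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}  # P(heads | hypothesis)

# Observe three heads in a row; the posterior becomes the new prior each time.
for _ in range(3):
    belief = bayes_update(belief, p_heads)

print(belief)
```

Each observed head shifts belief toward the biased hypothesis, and the probabilities always renormalize to sum to 1.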

In the area of education, the promise of intelligent tutoring systems was recognized even in the 1960s, but real progress has been a long time coming. The primary reasons are shortcomings of content and access: most tutoring systems don’t understand the content of what they purport to teach, nor can they engage in two-way communication with their pupils through speech or text. (I imagine myself teaching string theory, which I don’t understand, in Laotian, which I don’t speak.) Recent progress in speech recognition means that automated tutors can, at last, communicate with pupils who are not ...

22%

We are a very long way from being able to create machine learning systems that are capable of matching or exceeding the capacity for cumulative learning and discovery exhibited by the scientific community—or by ordinary human beings

22%

Nelson Goodman’s Fact, Fiction, and Forecast—written in 1954 and perhaps one of the most important and underappreciated books on machine learning—suggests a kind of knowledge called an overhypothesis, because it helps to define what the space of reasonable hypotheses might be.

22%

In the philosophy of science, particularly in the early twentieth century, it was not uncommon to see the discovery of new concepts attributed to the three ineffable I’s: intuition, insight, and inspiration. All these were considered resistant to any rational or algorithmic explanation. AI researchers, including Herbert Simon, have objected strongly to this view. Put simply, if a machine learning algorithm can search in a space of hypotheses that includes the possibility of adding definitions for new terms not present in the input, then the algorithm can discover new concepts.

22%

As Alfred North Whitehead wrote in 1911, “Civilization advances by extending the number of important operations which we can perform without thinking about them.”

25%

To run a simulation, however, requires a great deal of empirical knowledge of biology, some of which is currently unavailable; so, more model-building experiments would have to be done first. Undoubtedly, these would take time and must be done in the real world.

26%

There are some limits to what AI can provide. The pies of land and raw materials are not infinite, so there cannot be unlimited population growth and not everyone will have a mansion in a private park. (This will eventually necessitate mining elsewhere in the solar system and constructing artificial habitats in space; but I promised not to talk about science fiction.) The pie of pride is also finite: only 1 percent of people can be in the top 1 percent on any given metric. If human happiness requires being in the top 1 percent, then 99 percent of humans are going to be unhappy, even when the ...

27%

The right to mental security does not appear to be enshrined in the Universal Declaration. Articles 18 and 19 establish the rights of “freedom of thought” and “freedom of opinion and expression.” One’s thoughts and opinions are, of course, partly formed by one’s information environment, which, in turn, is subject to Article 19’s “right to . . . impart information and ideas through any media and regardless of frontiers.” That is, anyone, anywhere in the world, has the right to impart false information to you. And therein lies the difficulty: democratic nations, particularly the United States, ...

29%

Routine forms of computer programming—the kind that is often outsourced today—are also likely to be automated. Indeed, almost anything that can be outsourced is a good candidate for automation, because outsourcing involves decomposing jobs into tasks that can be parceled up and distributed in a decontextualized form.

30%

Most have already discovered that the idea of retraining everyone as a data scientist or robot engineer is a nonstarter—the world might need five or ten million of these, but nowhere close to the billion or so jobs that are at risk. Data science is a very tiny lifeboat for a giant cruise ship.

30%

We need a radical rethinking of our educational system and our scientific enterprise to focus more attention on the human rather than the physical world. (Joseph Aoun, president of Northeastern University, argues that universities should be teaching and studying “humanics.”) It sounds odd to say that happiness should be an engineering discipline, but that seems to be the inevitable conclusion. Such a discipline would build on basic science—a better understanding of how human minds work at the cognitive and emotional levels—and would train a wide variety of practitioners, ranging from life ...

35%

When one first introduces these ideas to a technical audience, one can see the thought bubbles popping out of their heads, beginning with the words “But, but, but . . .” and ending with exclamation marks. The first kind of but takes the form of denial. The deniers say, “But this can’t be a real problem, because XYZ.” Some of the XYZs reflect a reasoning process that might charitably be described as wishful thinking, while others are more substantial. The second kind of but takes the form of deflection: accepting that the problems are real but arguing that we shouldn’t try to solve them, either ...

36%

It is a staple of modern psychology that a single IQ number cannot characterize the full richness of human intelligence. There are, the theory says, different dimensions of intelligence: spatial, logical, linguistic, social, and so on.

41%

intended primarily as a guide to AI researchers and developers in thinking about how to create beneficial AI systems; they are not intended as explicit laws for AI systems to follow:

1. The machine’s only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.

47%

The first elaboration is to impose a cost for asking Harriet to make decisions or answer questions. (That is, we assume Robbie knows at least this much about Harriet’s preferences: her time is valuable.) In that case, Robbie is less inclined to bother Harriet if he is nearly certain about her preferences; the larger the cost, the more uncertain Robbie has to be before bothering Harriet. This is as it should be. And if Harriet is really grumpy about being interrupted, she shouldn’t be too surprised if Robbie occasionally does things she doesn’t like.

48%
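A minimal sketch of the cost-of-asking trade-off in the highlight above. It is a simplified model under stated assumptions and not the book's own formalization: Robbie holds a discrete belief over how much Harriet values a proposed action, Harriet (if asked) approves the action only when its utility to her is positive, and asking subtracts a fixed cost. The function name and all numbers are hypothetical.

```python
def choose(belief, ask_cost):
    """Pick Robbie's best option: 'act', 'skip', or 'ask'.

    belief: list of (probability, utility_to_harriet) pairs for the action.
    ask_cost: assumed fixed cost of interrupting Harriet.
    """
    # Acting without asking earns the expected utility, good or bad.
    act = sum(p * u for p, u in belief)
    # Skipping the action is assumed to be worth 0.
    skip = 0.0
    # Asking lets Harriet veto the bad cases, minus the interruption cost.
    ask = sum(p * max(u, 0.0) for p, u in belief) - ask_cost
    options = {"act": act, "skip": skip, "ask": ask}
    return max(options, key=options.get)


quite_uncertain = [(0.6, 10.0), (0.4, -10.0)]
nearly_certain = [(0.95, 10.0), (0.05, -10.0)]

print(choose(quite_uncertain, ask_cost=1.0))  # uncertain and cheap to ask
print(choose(nearly_certain, ask_cost=1.0))   # nearly certain, same cost
print(choose(quite_uncertain, ask_cost=5.0))  # uncertain but costly to ask
```

Under these assumptions, raising the cost of asking or narrowing Robbie's uncertainty both push him toward acting without consulting Harriet, which means he will occasionally do something she dislikes.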

Let’s call this the loophole principle: if a sufficiently intelligent machine has an incentive to bring about some condition, then it is generally going to be impossible for mere humans to write prohibitions on its actions to prevent it from doing so or to prevent it from doing something effectively equivalent.

56%

Neuroscientists are beginning to get a handle on the mechanics of some emotional states and their connections to other cognitive processes, and there is some useful work on computational methods for detecting, predicting, and manipulating human emotional states,
