Kindle Notes & Highlights
Read between March 19 and April 9, 2023
Inevitably, these machines will be uncertain about our objectives—after all, we are uncertain about them ourselves—but it turns out that this is a feature, not a bug (that is, a good thing and not a bad thing). Uncertainty about objectives implies that machines will necessarily defer to humans: they will ask permission, they will accept correction, and they will allow themselves to be switched off.
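A minimal sketch in Python (my construction, not the book's) of why objective uncertainty favors deference: the machine compares acting immediately against asking the human first, under an assumed probability that its proposed action is actually what the human wants. The names, payoffs, and the assumption that asking is free are all illustrative.

```python
# Sketch: a machine unsure whether its action helps (+1) or harms (-1) the human.
# If it defers, the human approves helpful actions and vetoes harmful ones (0).
# Assumes asking costs nothing and the human answers correctly.

def expected_utility_act(p_helpful: float) -> float:
    """Act immediately: +1 if the action is helpful, -1 otherwise."""
    return p_helpful * 1.0 + (1.0 - p_helpful) * (-1.0)

def expected_utility_defer(p_helpful: float) -> float:
    """Ask first: helpful actions are approved (+1), harmful ones vetoed (0)."""
    return p_helpful * 1.0 + (1.0 - p_helpful) * 0.0

for p in (0.9, 0.6, 0.5):
    print(f"P(helpful)={p}: act={expected_utility_act(p):+.2f}, "
          f"defer={expected_utility_defer(p):+.2f}")
```

Under these toy assumptions, deferring is never worse than acting and is strictly better whenever the machine is not certain the action is helpful, which is the sense in which uncertainty about objectives makes the machine ask permission and accept correction.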
every step towards an explanation of how the mind works is also a step towards the creation of the mind’s capabilities in an artifact—that is, a step towards artificial intelligence.
All those Hollywood plots about machines mysteriously becoming conscious and hating humans are really missing the point: it’s competence, not consciousness, that matters.
Bernoulli’s introduction of utility—an invisible property—to explain human behavior via a mathematical theory was an utterly remarkable proposal for its time. It was all the more remarkable for the fact that, unlike monetary amounts, the utility values of various bets and prizes are not directly observable; instead, utilities are to be inferred from the preferences exhibited by an individual.
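A hedged illustration (my numbers, not Bernoulli's) of how a hidden utility function is inferred from observed preferences over bets: with a concave utility of wealth such as log(w), an agent can prefer a sure $450 to a 50/50 bet on $0 or $1,000, even though the bet has a higher expected monetary value of $500. The baseline wealth of $1,000 is an assumption for the example.

```python
import math

wealth = 1_000  # assumed current wealth

def utility(w: float) -> float:
    """Bernoulli-style concave utility of total wealth."""
    return math.log(w)

sure_thing = utility(wealth + 450)
bet = 0.5 * utility(wealth + 0) + 0.5 * utility(wealth + 1_000)
expected_money = 0.5 * 0 + 0.5 * 1_000

print(f"expected money of bet: ${expected_money:.0f} (vs. sure $450)")
print(f"expected utility: sure = {sure_thing:.4f} vs bet = {bet:.4f}")
```

Observing that someone takes the sure $450 over the bet is exactly the kind of exhibited preference from which the unobservable utility curve can be inferred.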
A great deal of our cognitive structure is there to compensate for the mismatch between our small, slow brains and the incomprehensibly huge complexity of the decision problem that we face all the time.
Complexity means that the real-world decision problem—the problem of deciding what to do right now, at every instant in one’s life—is so difficult that neither humans nor computers will ever come close to finding perfect solutions.
Alfred North Whitehead wrote in 1911, “Civilization advances by extending the number of important operations which we can perform without thinking about them.”
A system that can both discover new high-level actions—as described earlier—and manage its computational activity to focus on units of computation that quickly deliver significant improvements in decision quality would be a formidable decision maker in the real world.
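A rough sketch (my construction, not the book's) of the second half of that idea, managing computational activity: given a time budget, prefer whichever units of computation offer the largest estimated improvement in decision quality per second spent. The step names, gains, and costs are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ComputationStep:
    name: str
    est_quality_gain: float   # estimated improvement in decision quality
    cost_seconds: float       # time the step would take

def plan_computation(steps, budget_seconds):
    """Greedy metareasoning: best estimated gain per second first."""
    chosen = []
    for step in sorted(steps,
                       key=lambda s: s.est_quality_gain / s.cost_seconds,
                       reverse=True):
        if step.cost_seconds <= budget_seconds:
            chosen.append(step.name)
            budget_seconds -= step.cost_seconds
    return chosen

steps = [
    ComputationStep("refine top-level plan", 0.30, 2.0),
    ComputationStep("simulate risky subcase", 0.25, 0.5),
    ComputationStep("exhaustive lookahead", 0.40, 10.0),
]
print(plan_computation(steps, budget_seconds=3.0))
```

With a 3-second budget the cheap, high-value simulation and the plan refinement are done and the exhaustive lookahead is skipped: computation is focused where it quickly delivers decision-quality improvements.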
to determine whether a specific drug cures a certain kind of cancer in an experimental animal, a scientist—human or machine—has two choices: inject the animal with the drug and wait several weeks or run a sufficiently accurate simulation. To run a simulation, however, requires a great deal of empirical knowledge of biology, some of which is currently unavailable; so, more model-building experiments would have to be done first. Undoubtedly, these would take time and must be done in the real world.
A final limitation of machines is that they are not human. This puts them at an intrinsic disadvantage when trying to model and predict one particular class of objects: humans. Our brains are all quite similar, so we can use them to simulate—to experience, if you will—the mental and emotional lives of others. This, for us, comes for free.
it’s reasonable to suppose that acquiring a human-level or superhuman understanding of humans will take them longer than most other capabilities.
the direct effects of technology work both ways: at first, by increasing productivity, technology can increase employment by reducing the price of an activity and thereby increasing demand; subsequently, further increases in technology mean that fewer and fewer humans are required.
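A toy arithmetic illustration (my numbers, not the book's) of how the same productivity gain can cut either way, depending on how much demand grows when the price of the activity falls.

```python
def workers_needed(jobs_demanded: float, jobs_per_worker: float) -> float:
    return jobs_demanded / jobs_per_worker

# Before automation: 100 house-painting jobs demanded, 1 job per worker.
print("before automation:", workers_needed(100, 1.0))   # 100 workers

# Productivity doubles, so the price of a paint job roughly halves.
# Early phase: demand more than doubles (elastic) -> employment rises.
print("elastic demand:  ", workers_needed(250, 2.0))    # 125 workers

# Later phase: demand is nearly saturated (inelastic) -> employment falls.
print("inelastic demand:", workers_needed(120, 2.0))    # 60 workers
```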
Historically, most mainstream economists have argued from the “big picture” view: automation increases productivity, so, as a whole, humans are better off, in the sense that we enjoy more goods and services for the same amount of work. Economic theory does not, unfortunately, predict that each human will be better off as a result of automation. Generally, automation increases the share of income going to capital (the owners of the housepainting robots) and decreases the share going to labor (the ex-housepainters).
It is important to understand that self-preservation doesn’t have to be any sort of built-in instinct or prime directive in machines. (So Isaac Asimov’s Third Law of Robotics, which begins “A robot must protect its own existence,” is completely unnecessary.) There is no need to build self-preservation in because it is an instrumental goal—a goal that is a useful subgoal of almost any original objective. Any entity that has a definite objective will automatically act as if it also has instrumental goals.
If an intelligence explosion does occur, and if we have not already solved the problem of controlling machines with only slightly superhuman intelligence—for example, if we cannot prevent them from making these recursive self-improvements—then we would have no time left to solve the control problem and the game would be over. This is Bostrom’s hard takeoff scenario, in which the machine’s intelligence increases astronomically in just days or weeks. In Turing’s words, it is “certainly something which can give us anxiety.”
a long-term risk can still be cause for immediate concern. The right time to worry about a potentially serious problem for humanity depends not just on when the problem will occur but also on how long it will take to prepare and implement a solution.
To varying degrees, all the major technological issues of the twentieth century—nuclear power, genetically modified organisms (GMOs), and fossil fuels—succumbed to tribalism. On each issue, there are two sides, pro and anti. The dynamics and outcomes of each have been different, but the symptoms of tribalism are similar: mutual distrust and denigration, irrational arguments, and a refusal to concede any (reasonable) point that might favor the other tribe. On the pro-technology side, one sees denial and concealment of risks combined with accusations of Luddism; on the anti side, one sees a …
the AI debate is in danger of becoming tribal, of creating pro-AI and anti-AI camps. This would be damaging to the field because it’s simply not true that being concerned about the risks inherent in advanced AI is an anti-AI stance.
The first reason for optimism is that there are strong economic incentives to develop AI systems that defer to humans and gradually align themselves to user preferences and intentions. Such systems will be highly desirable: the range of behaviors they can exhibit is simply far greater than that of machines with fixed, known objectives. They will ask humans questions or ask for permission when appropriate; they will do “trial runs” to see if we like what they propose to do; they will accept correction when they do something wrong.
The reason we have moral philosophy is that there is more than one person on Earth. The approach that is most relevant for understanding how AI systems should be designed is often called consequentialism: the idea that choices should be judged according to expected consequences.