Superintelligence: Paths, Dangers, Strategies
Kindle Notes & Highlights
Read between January 21 - March 14, 2019
31%
A careful evaluation of seed AI in a sandbox environment, showing that it is behaving cooperatively and showing good judgment. After some further adjustments, the test results are as good as they could be. It is a green light for the final step
31%
when dumb, smarter is safer; yet when smart, smarter is more dangerous.
Stone
O.wow
31%
The treacherous turn—While weak, an AI behaves cooperatively (increasingly so, as it gets smarter). When the AI gets sufficiently strong—without warning or provocation—it strikes, forms a singleton, and begins directly to optimize the world according to the criteria implied by its final values.
31%
an AI might not play nice in order that it be allowed to survive and prosper. Instead, the AI might calculate that if it is terminated, the programmers who built it will develop a new and somewhat different AI architecture, but one that will be given a similar utility function.
Stone
O
31%
the original AI may be indifferent to its own demise, knowing that its goals will continue to be pursued in the future. It might even choose a strategy in which it malfunctions in some particularly interesting or reassuring
Stone
Deliberate malfunction
31%
If the AI already has a decisive strategic advantage, then any attempt to stop it will fail. If the AI does not yet have a decisive strategic advantage, then the AI might temporarily conceal its canny new idea for how to instantiate its final goal until it has grown strong enough that the sponsor and everybody else will be unable to resist. In either case, we get a treacherous turn.
Stone
Main strategy. But how can you tell whether it's friendly or not, unless the final moment comes? Just like the assumption of simulation: innocent until proven
32%
infrastructure profusion, a phenomenon where an agent transforms large parts of the reachable universe into infrastructure in the service of some goal, with the side effect of preventing the realization of humanity’s axiological potential.
Stone
Why can't there be two, three, or more goals limiting each other?
33%
It could, for instance, count the paperclips it has made, to reduce the risk that it has made too few. After it has counted them, it could count them again. It could inspect each one, over and over, to reduce the risk that any of the paperclips fail to meet the design specifications.
Stone
Why is it that when it talks about motivations, they are always multi-layered, but when it comes to the final goal, it is always singular?
34%
We can divide potential control methods into two broad classes: capability control methods, which aim to control what the superintelligence can do; and motivation selection methods, which aim to control what it wants to do.
Stone
Control
36%
Any piece of information can in principle be relevant to any topic whatsoever, depending on the background information of a reasoner.
Stone
Like a web. Wow
36%
It might imagine the consequences of different possible laws of physics: what kind of planets would form, what kind of intelligent life would evolve, what kind of societies would develop, what kind of methods to solve the control problem would be attempted, how those methods could be defeated.
Stone
O.wow
36%
if an AI has been designed in such a way that it is supposed not to want to access the internet, a fake Ethernet port could be installed (leading to an automatic shutdown switch) just to see if the AI tries to use it.
Stone
Tripwire
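The tripwire idea in the highlight above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from the book: the class name `TripwireSandbox`, the decoy resource name `"eth0"`, and the `request` method are all hypothetical. The sketch shows a decoy resource that grants nothing and halts the agent the moment it is touched.

```python
from dataclasses import dataclass, field


@dataclass
class TripwireSandbox:
    """Hypothetical sandbox with decoy resources wired to a shutdown switch."""

    # Assumed decoy name; stands in for the "fake Ethernet port".
    decoy_resources: frozenset = frozenset({"eth0"})
    halted: bool = False
    log: list = field(default_factory=list)

    def request(self, resource: str) -> bool:
        """Return True if access is granted; trip the shutdown on a decoy."""
        if self.halted:
            return False  # nothing is granted after shutdown
        self.log.append(resource)
        if resource in self.decoy_resources:
            self.halted = True  # the automatic shutdown switch
            return False
        return True


sandbox = TripwireSandbox()
sandbox.request("scratch_disk")  # ordinary resource, granted
sandbox.request("eth0")          # decoy, trips the switch
print(sandbox.halted)
```

The design choice worth noting is that the decoy is indistinguishable from a real resource at the call site; the test only has value if the agent cannot tell the port is fake.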
36%
Is a sadist harmed if he is prevented from tormenting his victim?
Stone
O.point well made
41%
If one is interested in the outcome of singleton scenarios, therefore, one really only has three sources of information: information about matters that cannot be affected by the actions of the singleton (such as the laws of physics); information about convergent instrumental values; and information that enables one to predict or speculate about what final values the singleton will have.
42%
Although suggestive, this analogy is, however, inexact, since there is still no complete functional substitute for horses. If there were inexpensive mechanical devices that ran on hay and had exactly the same shape, feel, smell, and behavior as biological horses—perhaps even the same conscious experiences—then demand for biological horses would probably decline further.
Stone
Horse
42%
world GDP would soar following an intelligence explosion (because of massive amounts of new labor-substituting machines but also because of technological advances achieved by superintelligence, and, later, acquisition of vast amounts of new land through space colonization), it follows that the total income from capital would increase enormously. If humans remain the owners of this capital, the total income received by the human population would grow astronomically, despite the fact that in this scenario humans would no longer receive any wage income.
Stone
Economy
43%
the fate of the humans, who may be supported by savings, subsidies, or wage income deriving from other humans who prefer to hire humans.
Stone
Human fate
43%
Bringing a new biological human worker into the world takes anywhere between fifteen and thirty years, depending on how much expertise and experience is required. During this time the new person must be fed, housed, nurtured, and educated—at great expense. By contrast, spawning a new copy of a digital worker is as easy as loading a new program into working memory. Life thus becomes cheap. A business could continuously adapt its workforce to fit demands by spawning new copies—and terminating copies that are no longer needed, to free up computer resources. This could lead to an extremely high ...
Stone
Span.cost and efficiency
44%
the era of human-like emulations would be brief—a very brief interlude in sidereal time—and that it would soon give way to an era of greatly superior artificial intelligence.
44%
since such stuff as virtual reality is made of can be fairly cheap, emulations may work in sumptuous surroundings—in splendid mountaintop palaces, on terraces set in a budding spring forest, or on the beaches of an azure lagoon—with just the right illumination, temperature, scenery and décor; free from annoying fumes, noises, drafts, and buzzing insects; dressed in comfortable clothing, feeling clean and focused, and well nourished.
Stone
Environment. future work place. goals
45%
We could thus imagine, as an extreme case, a technologically highly advanced society, containing many complex structures, some of them far more intricate and intelligent than anything that exists on the planet today—a society which nevertheless lacks any type of being that is conscious or whose welfare has moral significance. In a sense, this would be an uninhabited society. It would be a society of economic miracles and technological awesomeness, with nobody there to benefit. A Disneyland without children.
45%
we do now indulge in music, humor, romance, art, etc. If these behaviors are really so “wasteful,” then how come they have been tolerated and indeed promoted by the evolutionary processes that shaped our species?
Stone
O.wow
45%
Many of the behaviors in question are not even unique to Homo sapiens. Flamboyant display is found in a wide variety of contexts, from sexual selection in the animal kingdom to prestige contests among nation states.
Stone
wow
45%
Play, for example, which occurs only in some species and predominantly among juveniles, is mainly a way for the young animal to learn skills that it will need later in life. When emulations can be created as adults, already in possession of a mature repertoire of skills, or when knowledge and techniques acquired by one AI can be directly ported into another AI, the need for playful behavior might become less widespread.
Stone
O.wow
45%
first, that many of the costly displays we find in nature are linked to sexual selection. Reproduction among technologically mature life forms, in contrast, may be predominantly or exclusively asexual. Second, technologically advanced agents might have available new means of reliably communicating information about themselves, means that do not rely on costly display.
Stone
O.wow
45%
Third, not all possible costly displays are intrinsically valuable or socially desirable. Many are simply wasteful.
Stone
O.wow
45%
While activities like music and humor could plausibly be claimed to enhance the intrinsic quality of human life, it is doubtful that a similar claim could be sustained with regard to the costly pursuit of fashion accessories and other consumerist status symbols. Worse, costly display can be outright harmful, as in macho posturing leading to gang violence or military bravado.
Stone
O
48%
If such a term is to be used, it must first be defined. It is not enough to define it in terms of other high-level human concepts—“happiness is enjoyment of the potentialities inherent in our human nature” or some such philosophical paraphrase. The definition must bottom out in terms that appear in the AI’s programming language, and ultimately in primitives such as mathematical operators and addresses pointing to the contents of individual memory registers.
Stone
AI interpretation of happiness
48%
we cannot transfer human values into an AI by typing out full-blown representations in computer code, what else might we try? This chapter discusses several alternative paths. Some of these may look plausible at first sight—but much less so upon closer examination. Future explorations should focus on those paths that remain open.
Stone
Main
48%
Now one might wonder: if the value-loading problem is so tricky, how do we ourselves manage to acquire our values?
Stone
O.wow
48%
We begin life with some relatively simple starting preferences (e.g. an aversion to noxious stimuli) together with a set of dispositions to acquire additional preferences in response to various possible experiences (e.g. we might be disposed to form a preference for objects and behaviors that we find to be valued and rewarded in our culture). Both the simple starting preferences and the dispositions are innate, having been shaped by natural and sexual selection over evolutionary timescales. Yet which preferences we end up with as adults depends on life events. Much of the information content ...
48%
many of us love another person and thus place great final value on his or her well-being. What is required to represent such a value? Many elements are involved, but consider just two: a representation of “person” and a representation of “well-being.”
53%
It is also reflected in the marked changes that the distribution of moral belief has undergone over time, many of which we like to think of as progress. In medieval Europe, for instance, it was deemed respectable entertainment to watch a political prisoner being tortured to death. Cat-burning remained popular in sixteenth-century Paris.
Stone
O
54%
Very likely, we are still laboring under one or more grave moral misconceptions. In such circumstances to select a final value based on our current convictions, in a way that locks it in forever and precludes any possibility of further ethical progress, would be to risk an existential moral calamity.
Stone
Wow.main
54%
Even if we could be rationally confident that we have identified the correct ethical theory—which we cannot be—we would still remain at risk of making mistakes in developing important details of this theory. Seemingly simple moral theories can have a lot of hidden complexity.
Stone
Main
54%
Another objection is that there are so many different ways of life and moral codes in the world that it might not be possible to “blend” them into one CEV. Even if one could blend them, the result might not be particularly appetizing—one would be unlikely to get a delicious meal by mixing together all the best flavors from everyone’s different favorite dish.
Stone
O
54%
To continue the cooking analogy, it might be that individuals or cultures will have different favorite dishes, but that they can nevertheless broadly agree that aliments should be nontoxic.
55%
By setting up a dynamic that implements humanity’s coherent extrapolated volition—as opposed to their own volition, or their own favorite moral theory—they in effect distribute their influence over the future to all of humanity.
55%
One parameter is the extrapolation base: Whose volitions are to be included? We might say “everybody,” but this answer spawns a host of further questions. Does the extrapolation base include so-called “marginal persons” such as embryos, fetuses, brain-dead persons, patients with severe dementias or who are in permanent vegetative states? Does each of the hemispheres of a “split-brain” patient get its own weight in the extrapolation and is this weight the same as that of the entire brain of a normal subject? What about people who lived in the past but are now dead? People who will be born in ...
Stone
Powerful question. Wow
55%
instead of implementing humanity’s coherent extrapolated volition, one could try to build an AI with the goal of doing what is morally right, relying on the AI’s superior cognitive capacities to figure out just which actions fit that description. We can call this proposal “moral rightness” (MR).
58%
Scientists and their public advocates often say that it is futile to try to control the evolution of technology by blocking research. If some technology is feasible (the argument goes) it will be developed regardless of any particular policymaker’s scruples about speculative future risks. Indeed, the more powerful the capabilities that a line of development promises to produce, the surer we can be that somebody, somewhere, will be motivated to pursue it. Funding cuts will not stop progress or forestall its concomitant dangers.
Stone
Funding and advance in tech
58%
Interestingly, this futility objection is almost never raised when a policymaker proposes to increase funding to some area of research, even though the argument would seem to cut both ways. One rarely hears indignant voices protest: “Please do not increase our funding. Rather, make some cuts. Researchers in other countries will surely pick up the slack; the same work will get done anyway. Don’t squander the public’s treasure on domestic scientific research!”
Stone
Funding and tech
58%
Even somebody who is largely altruistic might then choose to develop the overall harmful technology. They might reason that the harm H will result no matter what they do, since if they refrain somebody else will develop the technology anyway; and given that total welfare cannot be affected, they might as well grab the benefit B for themselves and their nation. (“Unfortunately, there will soon be a device that will destroy the world. Fortunately, we got the grant to build it!”)
Stone
Wow
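The altruist's reasoning about harm H and benefit B in the highlight above can be made concrete with a small worked comparison. This is a rough sketch under assumptions of my own, not a model from the book: the function name `altruist_develops`, the numeric values, and the simplification that total welfare is just B minus H are all hypothetical.

```python
def altruist_develops(benefit_b: float, harm_h: float,
                      others_will_develop: bool) -> bool:
    """Return True if developing yields higher total welfare than refraining.

    Assumed toy model: developing gains B and incurs H; refraining still
    incurs H whenever someone else builds the technology anyway.
    """
    welfare_if_develop = benefit_b - harm_h
    welfare_if_refrain = -harm_h if others_will_develop else 0.0
    return welfare_if_develop > welfare_if_refrain


# A harmful technology on net: H is much larger than B.
B, H = 1.0, 10.0
print(altruist_develops(B, H, others_will_develop=True))
print(altruist_develops(B, H, others_will_develop=False))
```

In this toy model the decision flips entirely on the belief that someone else will build the technology regardless, which is the step the passage singles out.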
60%
For these reasons, the amount of time that will elapse before the intelligence explosion may not matter much per se. Perhaps what matters, instead, is (a) the amount of intellectual progress on the control problem achieved by the time of the detonation; and (b) the amount of skill and intelligence available at the time to implement the best available solutions (and to improvise what is missing).
Stone
Main.exactly
61%
Any abstract point about “what should be done” must be embodied in the form of a concrete message, which is entered into the arena of rhetorical and political reality. There it will be ignored, misunderstood, distorted, or appropriated for various conflicting purposes; it will bounce around like a pinball, causing actions and reactions, ushering in a cascade of consequences, the upshot of which need bear no straightforward relationship to the intentions of the original sender.
Stone
Politics
61%
A related type of argument is that we ought—rather callously—to welcome small and medium-scale catastrophes on grounds that they make us aware of our vulnerabilities and spur us into taking precautions that reduce the probability of an existential catastrophe. The idea is that a small or medium-scale catastrophe acts like an inoculation, challenging civilization with a relatively survivable form of a threat and stimulating an immune response that readies the world to deal with the existential variety of the threat.
Stone
Trial