Is the surface of our planet — and maybe every planet we can get our hands on — going to be carpeted in paper clips (and paper clip factories) by a well-intentioned but misguided artificial intelligence (AI) that ultimately cannibalizes everything in sight, including us, in single-minded pursuit of a seemingly innocuous goal? Nick Bostrom, head of Oxford’s Future of Humanity Institute, thinks that we can’t guarantee it _won’t_ happen, and it worries him. It doesn’t require Skynet and Terminators, it doesn’t require evil geniuses bent on destroying the world, it just requires a powerful AI with a moral system in which humanity’s welfare is irrelevant or defined very differently than most humans today would define it. If the AI has a single goal and is smart enough to outwit our attempts to disable or control it once it has gotten loose, Game Over, argues Professor Bostrom in his book _Superintelligence_.
This is perhaps the most important book I have read this decade, and it has kept me awake at night for weeks. I want to tell you why, and what I think, but a lot of this is difficult ground, so please bear with me. The short form is that I am fairly certain that we _will_ build a true AI, and I respect Vernor Vinge, but I have long been skeptical of the Kurzweilian notions of inevitability, doubly-exponential growth, and the Singularity. I’ve also been skeptical of the idea that AIs will destroy us, either on purpose or by accident. Bostrom’s book has made me think that perhaps I was naive. I still think that, on the whole, his worst-case scenarios are unlikely. However, he argues persuasively that we can’t yet rule out any number of bad outcomes of developing AI, and that we need to be investing much more in figuring out whether developing AI is a good idea. We may need to put a moratorium on research, as was done for a few years with recombinant DNA starting in 1975. We also need to be prepared for the possibility that such a moratorium doesn’t hold. Bostrom also brings up any number of mind-bending dystopias around what qualifies as human, which we’ll get to below.
(snips to my review, since Goodreads limits length)
In case it isn’t obvious by now, both Bostrom and I take it for granted that it’s not only possible but nearly inevitable that we will create a strong AI, in the sense of it being a general, adaptable intelligence. Bostrom skirts the issue of whether it will be conscious, or “have qualia”, as I think the philosophers of mind say.
Where Bostrom and I differ is in the level of plausibility we assign to the idea of a truly exponential explosion in intelligence by AIs, in a takeoff for which Vernor Vinge coined the term “the Singularity.” Vinge is rational, but Ray Kurzweil is the most famous proponent of the Singularity. I read one of Kurzweil’s books a number of years ago, and I found it imbued with a lot of near-mystic hype. He believes the Universe’s purpose is the creation of intelligence, and that that process is growing on a double exponential, starting from stars and rocks through slime molds and humans and on to digital beings.
I’m largely allergic to that kind of hooey. I really don’t see any evidence of the domain-to-domain acceleration that Kurzweil sees, and in particular the shift from biological to digital beings will result in a radical shift in the evolutionary pressures. I see no reason why any sort of “law” should dictate that digital beings will evolve at a rate that *must* be faster than the biological one. I also don’t see that Kurzweil really pays any attention to the physical limits of what will ultimately be possible for computing machines. Exponentials can’t continue forever, as Danny Hillis is fond of pointing out. http://www.kurzweilai.net/ask-ray-the…
So perhaps my opinion is somewhat biased by a dislike of Kurzweil’s circus barker approach, but I think there is more to it than that. Fundamentally, I would put it this way:
Being smart is hard.
And making yourself smarter is also hard. My inclination is that getting smarter is at least as hard as the advantages it brings, so that the difficulty of the problem and the resources that can be brought to bear on it roughly balance. This will result in a much slower takeoff than Kurzweil reckons, in my opinion. Bostrom presents a spectrum of takeoff speeds, from “too fast for us to notice” through “long enough for us to develop international agreements and monitoring institutions,” but he makes it fairly clear that he believes that the probability of a fast takeoff is far too large to ignore. There are parts of his argument I find convincing, and parts I find less so.
To give you a little more insight into why I am a little dubious that the Singularity will happen in what Bostrom would describe as a moderate to fast takeoff, let me talk about the kinds of problems we human beings solve, and that an AI would have to solve. Actually, rather than the kinds of questions, first let me talk about the kinds of answers we would like an AI (or a pet family genius) to generate when given a problem. Off the top of my head, I can think of six:
[Speed] Same quality of answer, just faster. [Ply] Look deeper in number of plies (moves, in chess or go). [Data] Use more, and more up-to-date, data. [Creativity] Something beautiful and new. [Insight] Something new and meaningful, such as a new theory; probably combines elements of all of the above categories. [Values] An answer about (human) values.
The first three are really about how the answers are generated; the last three about what we want to get out of them. I think this set is reasonably complete and somewhat orthogonal, despite those differences.
So what kinds of problems do we apply these styles of answers to? We ultimately want answers that are “better” in some qualitative sense.
Humans are already pretty good at projecting the trajectory of a baseball, but it’s certainly conceivable that a robot batter could be better, by calculating faster and using better data. Such a robot might make for a boring opponent for a human, but it would not be beyond human comprehension.
But if you accidentally knock a bucket of baseballs down a set of stairs, better data and faster computing are unlikely to help you predict the exact order in which the balls will reach the bottom and what happens to the bucket. Someone “smarter” might be able to make some interesting statistical predictions that wouldn’t occur to you or me, but not fill in every detail of every interaction between the balls and stairs. Chaos, in the sense of sensitive dependence on initial conditions, is just too strong.
In chess, go, or shogi, a 1000x improvement in the number of plies that can be investigated gains you maybe only the ability to look ahead two or three moves more than before. Less if your pruning (discarding unpromising paths) is poor, more if it’s good. Don’t get me wrong — that’s a huge deal, any player will tell you. But in this case, humans are already pretty good, when not time limited.
Go players like to talk about how close the top pros are to God, and the possibly apocryphal answer from a top pro was that he would want a three-stone (three-move) handicap, four if his life depended on it. Compared this to the fact that a top pro is still some ten stones stronger than me, a fair amateur, and could beat a rank beginner even if the beginner was given the first forty moves. Top pros could sit across the board from an almost infinitely strong AI and still hold their heads up.
In the most recent human-versus-computer shogi (Japanese chess) series, humans came out on top, though presumably this won’t last much longer.
In chess, as machines got faster, looked more plies ahead, carried around more knowledge, and got better at pruning the tree of possible moves, human opponents were heard to say that they felt the glimmerings of insight or personality from them.
So again we have some problems, at least, where plies will help, and will eventually guarantee a 100% win rate against the best (non-augmented) humans, but they will likely not move beyond what humans can comprehend.
Simply being able to hold more data in your head (or the AI’s head) while making a medical diagnosis using epidemiological data, or cross-correlating drug interactions, for example, will definitely improve our lives, and I can imagine an AI doing this. Again, however, the AI’s capabilities are unlikely to recede into the distance as something we can’t comprehend.
We know that increasing the amount of data you can handle by a factor of a thousand gains you 10x in each dimension for a 3-D model of the atmosphere or ocean, up until chaotic effects begin to take over, and then (as we currently understand it) you can only resort to repeated simulations and statistical measures. The actual calculations done by a climate model long ago reached the point where even a large team of humans couldn’t complete them in a lifetime. But they are not calculations we cannot comprehend, in fact, humans design and debug them.
So for problems with answers in the first three categories, I would argue that being smarter is helpful, but being a *lot* smarter is *hard*. The size of computation grows quickly in many problems, and for many problems we believe that sheer computation is fundamentally limited in how well it can correspond to the real world.
But those are just the warmup. Those are things we already ask computers to do for us, even though they are “dumber” than we are. What about the latter three categories?
I’m no expert in creativity, and I know researchers study it intensively, so I’m going to weasel through by saying it is the ability to generate completely new material, which involves some random process. You also need the ability either to generate that material such that it is aesthetically pleasing with high probability, or to prune those new ideas rapidly using some metric that achieves your goal.
For my purposes here, insight is the ability to be creative not just for esthetic purposes, but in a specific technical or social context, and to validate the ideas. (No implication that artists don’t have insight is intended, this is just a technical distinction between phases of the operation, for my purposes here.) Einstein’s insight for special relativity was that the speed of light is constant. Either he generated many, many hypotheses (possibly unconsciously) and pruned them very rapidly, or his hypothesis generator was capable of generating only a few good ones. In either case, he also had the mathematical chops to prove (or at least analyze effectively) his hypothesis; this analysis likewise involves generating possible paths of proofs through the thicket of possibilities and finding the right one.
So, will someone smarter be able to do this much better? Well, it’s really clear that Einstein (or Feynman or Hawking, if your choice of favorite scientist leans that way) produced and validated hypotheses that the rest of us never could have. It’s less clear to me exactly how *much* smarter than the rest of us he was; did he generate and prune ten times as many hypotheses? A hundred? A million? My guess is it’s closer to the latter than the former. Even generating a single hypothesis that could be said to attack the problem is difficult, and most humans would decline to even try if you asked them to.
Making better devices and systems of any kind requires all of the above capabilities. You must have insight to innovate, and you must be able to quantitatively and qualitatively analyze the new systems, requiring the heavy use of data. As systems get more complex, all of this gets harder. My own favorite example is airplane engines. The Wright Brothers built their own engines for their planes. Today, it takes a team of hundreds to create a jet turbine — thousands, if you reach back into the supporting materials, combustion and fluid flow research. We humans have been able to continue to innovate by building on the work of prior generations, and especially harnessing teams of people in new ways. Unlike Peter Thiel, I don’t believe that our rate of innovation is in any serious danger of some precipitous decline sometime soon, but I do agree that we begin with the low-lying fruit, so that harvesting fruit requires more effort — or new techniques — with each passing generation.
The Singularity argument depends on the notion that the AI would design its own successor, or even modify itself to become smarter. Will we watch AIs gradually pull even with us and then ahead, but not disappear into the distance in a Roadrunner-like flash of dust covering just a few frames of film in our dull-witted comprehension?
Ultimately, this is the question on which continued human existence may depend: If an AI is enough smarter than we are, will it find the process of improving itself to be easy, or will each increment of intelligence be a hard problem for the system of the day? This is what Bostrom calls the “recalcitrance” of the problem.
I believe that the range of possible systems grows rapidly as they get more complex, and that evaluating them gets harder; this is hard to quantify, but each step might involve a thousand times as many options, or evaluating each option might be a thousand times harder. Growth in computational power won’t dramatically overbalance that and give sustained, rapid and accelerating growth that moves AIs beyond our comprehension quickly. (Don’t take these numbers seriously, it’s just an example.)
Bostrom believes that recalcitrance will grow more slowly than the resources the AI can bring to bear on the problem, resulting in continuing, and rapid, exponential increases in intelligence — the arrival of the Singularity. As you can tell from the above, I suspect that the opposite is the case, or that they very roughly balance, but Bostrom argues convincingly. He is forcing me to reconsider.
What about “values”, my sixth type of answer, above? Ah, there’s where it all goes awry. Chapter eight is titled, “Is the default scenario doom?” and it will keep you awake.
What happens when we put an AI in charge of a paper clip factory, and instruct it to make as many paper clips as it can? With such a simple set of instructions, it will do its best to acquire more resources in order to make more paper clips, building new factories in the process. If it’s smart enough, it will even anticipate that we might not like this and attempt to disable it, but it will have the will and means to deflect our feeble strikes against it. Eventually, it will take over every factory on the planet, continuing to produce paper clips until we are buried in them. It may even go on to asteroids and other planets in a single-minded attempt to carpet the Universe in paper clips.
I suppose it goes without saying that Bostrom thinks this would be a bad outcome. Bostrom reasons that AIs ultimately may or may not be similar enough to us that they count as our progeny, but doesn’t hesitate to view them as adversaries, or at least rivals, in the pursuit of resources and even existence. Bostrom clearly roots for humanity here. Which means it’s incumbent on us to find a way to prevent this from happening.
Bostrom thinks that instilling values that are actually close enough to ours that an AI will “see things our way” is nigh impossible. There are just too many ways that the whole process can go wrong. If an AI is given the goal of “maximizing human happiness,” does it count when it decides that the best way to do that is to create the maximum number of digitally emulated human minds, even if that means sacrificing some of the physical humans we already have because the planet’s carrying capacity is higher for digital than organic beings?
As long as we’re talking about digital humans, what about the idea that a super-smart AI might choose to simulate human minds in enough detail that they are conscious, in the process of trying to figure out humanity? Do those recursively digital beings deserve any legal standing? Do they count as human? If their simulations are stopped and destroyed, have they been euthanized, or even murdered? Some of the mind-bending scenarios that come out of this recursion kept me awake nights as I was reading the book.
He uses a variety of names for different strategies for containing AIs, including “genies” and “oracles”. The most carefully circumscribed ones are only allowed to answer questions, maybe even “yes/no” questions, and have no other means of communicating with the outside world. Given that Bostrom attributes nearly infinite brainpower to an AI, it is hard to effectively rule out that an AI could still find some way to manipulate us into doing its will. If the AI’s ability to probe the state of the world is likewise limited, Bsotrom argues that it can still turn even single-bit probes of its environment into a coherent picture. It can then decide to get loose and take over the world, and identify security flaws in outside systems that would allow it to do so even with its very limited ability to act.
I think this unlikely. Imagine we set up a system to monitor the AI that alerts us immediately when the AI begins the equivalent of a port scan, for whatever its interaction mechanism is. How could it possibly know of the existence and avoid triggering the alert? Bostrom has gone off the deep end in allowing an intelligence to infer facts about the world even when its data is very limited. Sherlock Holmes always turns out to be right, but that’s fiction; in reality, many, many hypotheses would suit the extremely slim amount of data he has. The same will be true with carefully boxed AIs.
At this point, Bostrom has argued that containing a nearly infinitely powerful intelligence is nearly impossible. That seems to me to be effectively tautological.
If we can’t contain them, what options do we have? After arguing earlier that we can’t give AIs our own values (and presenting mind-bending scenarios for what those values might actually mean in a Universe with digital beings), he then turns around and invests a whole string of chapters in describing how we might actually go about building systems that have those values from the beginning.
At this point, Bostrom began to lose me. Beyond the systems for giving AIs values, I felt he went off the rails in describing human behavior in simplistic terms. We are incapable of balancing our desire to reproduce with a view of the tragedy of the commons, and are inevitably doomed to live out our lives in a rude, resource-constrained existence. There were some interesting bits in the taxonomies of options, but the last third of the book felt very speculative, even more so than the earlier parts.
Bostrom is rational and seems to have thought carefully about the mechanisms by which AIs may actually arise. Here, I largely agree with him. I think his faster scenarios of development, though, are unlikely: being smart, and getting smarter, is hard. He thinks a “singleton”, a single, most powerful AI, is the nearly inevitable outcome. I think populations of AIs are more likely, but if anything this appears to make some problems worse. I also think his scenarios for controlling AIs are handicapped in their realism by the nearly infinite powers he assigns them. In either case, Bostrom has convinced me that once an AI is developed, there are many ways it can go wrong, to the detriment and possibly extermination of humanity. Both he and I are opposed to this. I’m not ready to declare a moratorium on AI research, but there are many disturbing possibilities and many difficult moral questions that need to be answered.
The first step in answering them, of course, is to begin discussing them in a rational fashion, while there is still time. Read the first 8 chapters of this book!
Read more here: