Accelerating Future
Transhumanism, AI, nanotech, the Singularity, and extinction risk.

16 May 2011

Hard Takeoff Sources

Definition of "hard takeoff" (noun) from Transhumanist Wiki:

The Singularity scenario in which a mind makes the transition from prehuman or human-equivalent intelligence to strong transhumanity or superintelligence over the course of days or hours (Yudkowsky 2001). The high likelihood of a hard takeoff once a roughly human-equivalent AI is created has been argued by the Singularity Institute in Yudkowsky 2003.

Hard takeoff sources and references, including hard science fiction novels, academic papers, and a few short articles and interviews:

Blood Music (1985) by Greg Bear
A Fire Upon the Deep (1992) by Vernor Vinge
"The Coming Technological Singularity" (1993) by Vernor Vinge
The Metamorphosis of Prime Intellect (1994) by Roger Williams
"Staring into the Singularity" (1996) by Eliezer Yudkowsky
Creating Friendly AI (2001) by Eliezer Yudkowsky
"Wiki Interview with Eliezer" (2002) by Anand
"Impact of the Singularity" (2002) by Eliezer Yudkowsky
"Levels of Organization in General Intelligence" (2002) by Eliezer Yudkowsky
"Ethical Issues in Advanced Artificial Intelligence" by Nick Bostrom
"Relative Advantages of Computer Programs, Minds-in-General, and the Human Brain" (2003) by Michael Anissimov and Anand
"Can We Avoid a Hard Takeoff?" (2005) by Vernor Vinge
"Radical Discontinuity Does Not Follow from Hard Takeoff" (2007) by Michael Anissimov
"Recursive Self-Improvement" (2008) by Eliezer Yudkowsky
"Artificial Intelligence as a Positive and Negative Factor in Global Risk" (2008) by Eliezer Yudkowsky
"The Hanson-Yudkowsky AI Foom Debate" (2008) on Less Wrong wiki
"Brain Emulation and Hard Takeoff" (2008) by Carl Shulman
"Arms Control and Intelligence Explosions" (2009) by Carl Shulman
"Hard Takeoff" (2009) on Less Wrong wiki
"When Software Goes Mental: Why Artificial Minds Mean Fast Endogenous Growth" (2009)
"Thinking About Thinkism" (2009) by Michael Anissimov
"Technological Singularity/Superintelligence/Friendly AI Concerns" (2009) by Michael Anissimov
"The Hard Takeoff Hypothesis" (2010), an abstract by Ben Goertzel
Economic Implications of Software Minds (2010) by S. Kaas, S. Rayhawk, A. Salamon and P. Salamon

Critiques

"The Age of Virtuous Machines" (2007) by J. Storrs Hall
"Thinkism" by Kevin Kelly (2008)
"The Hanson-Yudkowsky AI Foom Debate" (2008) on Less Wrong wiki
"How far can an AI jump?" by Katja Grace (2009)
"Is The City-ularity Near?" (2010) by Robin Hanson
"SIA says AI is no big threat" (2010) by Katja Grace

I don't mean to suggest that the critiques aren't important by putting them in a separate category; I'm just doing that for easy reference. I'm sure I missed some pages or articles here, so if you have any more, please add them in the comments.

9 March 2011

John Baez Interviews Eliezer Yudkowsky

From Azimuth, the blog of mathematical physicist John Baez (author of the Crackpot Index):

This week I'll start an interview with Eliezer Yudkowsky, who works at an institute he helped found: the Singularity Institute for Artificial Intelligence.

While many believe that global warming or peak oil are the biggest dangers facing humanity, Yudkowsky is more concerned about risks inherent in the accelerating development of technology. There are different scenarios one can imagine, but a bunch tend to get lumped under the general heading of a technological singularity. Instead of trying to explain this idea in all its variations, let me rapidly sketch its history and point you to some reading material. Then, on with the interview!


28 February 2011

Michael Vassar Speaks to Yale Students on the Singularity

Coverage from Yale Daily News:

Twenty to 60 years from now, the advent of computers with above-human intelligence could transform civilization as we know it, according to Michael Vassar, president of the Singularity Institute for Artificial Intelligence. In a talk with around 35 students and faculty members in William L. Harkness Hall on Sunday, Vassar expounded the vision that his institute, featured in a Feb. 10 article in TIME Magazine, is working to make a reality. Known as the "singularity," this futuristic scenario posits that artificial intelligence will surpass human intelligence within the next half-century. Once super-intelligent computers exist, they could generate even more intelligent and sophisticated machines, to the extent that humans would lose all control over the future, Vassar said.

"For the most important event in the history of events, it really should get a fair amount of buzz," he said.

Vassar compared human and chimpanzee intelligence to argue that small changes in a system can represent large leaps in mental capacity. Just as a human is a small evolutionary step from other primates, a super-intelligent computer would be a natural progression as artificial intelligence approaches human intelligence, he said.

Our computers are not as smart as humans yet, but if technological progress continues at its current rate, one could expect to see them in the next 20 to 60 years, Vassar said. Probably the most well-known example of artificial intelligence right now is Watson, an IBM computer that competed alongside humans on the quiz show "Jeopardy!" this month.


Filed under: SIAI, singularity
15 February 2011

Does the Universe Contain a Mysterious Force Pulling Entities Towards Malevolence?

One of my favorite books about the mind is the classic How the Mind Works by Steven Pinker. The theme of the first chapter, which sets the stage for the whole book, is Artificial Intelligence and why it is so hard to build. The reason, in the words of Minsky, is that "easy things are hard": the everyday thought processes we take for granted are extremely complex.

Unfortunately, benevolence is extremely complex too, so to build a Friendly AI we have a lot of work to do. I see this imperative as much more important than other transhumanist goals like curing aging, because if we solve Friendly AI we get everything else we want, but if we don't, we have to suffer the consequences of human-indifferent AI running amok with the biosphere. If such an AI had access to powerful technology, such as molecular nanotechnology, it could rapidly build its own infrastructure and displace us without much of a fight. It would be disappointing to spend billions of dollars on the war against aging just to be wiped out by unfriendly AI in 2045.

Anyway, to illustrate the problem, here's an excerpt from the book, pages 14-15:

Imagine that we have somehow overcome these challenges [the frame problem] and have a machine with sight, motor coordination, and common sense. Now we must figure out how the robot will put them to use. We have to give it motives.

What should a robot want? The classic answer is Asimov's Fundamental Rules of Robotics, "the three rules that are built most deeply into a robot's positronic brain".

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Asimov insightfully noticed that self-preservation, that universal biological imperative, does not automatically emerge in a complex system. It has to be programmed in (in this case, as the Third Law). After all, it is just as easy to build a robot that lets itself go to pot or eliminates a malfunction by committing suicide as it is to build a robot that always looks out for Number One. Perhaps easier; robot-makers sometimes watch in horror as their creations cheerfully shear off limbs or flatten themselves against walls, and a good proportion of the world's most intelligent machines are kamikaze cruise missiles and smart bombs.

But the need for the other two laws is far from obvious. Why give a robot an order to obey orders -- why aren't the original orders enough? Why command a robot not to do harm -- wouldn't it be easier never to command it to do harm in the first place? Does the universe contain a mysterious force pulling entities towards malevolence, so that a positronic brain must be programmed to withstand it? Do intelligent beings inevitably develop an attitude problem?

In this case Asimov, like generations of thinkers, like all of us, was unable to step outside his own thought processes and see them as artifacts of how our minds were put together rather than inescapable laws of the universe. Man's capacity for evil is never far from our minds, and it is easy to think that evil just comes along with intelligence as part of its very essence. It is a recurring theme in our cultural tradition: Adam and Eve eating the fruit of the tree of knowledge, Promethean fire and Pandora's box, the rampaging Golem, Faust's bargain, the Sorcerer's Apprentice, the adventures of Pinocchio, Frankenstein's monster, the murderous apes and mutinous HAL of 2001: A Space Odyssey. From the 1950s through the 1980s, countless films in the computer-runs-amok genre captured a popular fear that the exotic mainframes of the era would get smarter and more powerful and one day turn on us.

Now that computers really have become smarter and more powerful, the anxiety has waned. Today's ubiquitous, networked computers have an unprecedented ability to do mischief should they ever go to the bad. But the only mayhem comes from unpredictable chaos or from human malice in the form of viruses. We no longer worry about electronic serial killers or subversive silicon cabals because we are beginning to appreciate that malevolence -- like vision, motor coordination, and common sense -- does not come free with computation but has to be programmed in. The computer running WordPerfect on your desk will continue to fill paragraphs for as long as it does anything at all. Its software will not insidiously mutate into depravity like the picture of Dorian Gray.

Even if it could, why would it want to? To get -- what? More floppy disks? Control over the nation's railroad system? Gratification of a desire to commit senseless violence against laser-printer repairmen? And wouldn't it have to worry about reprisals from technicians who with the turn of a screwdriver could leave it pathetically singing "A Bicycle Built for Two"? A network of computers, perhaps, could discover the safety in numbers and plot an organized takeover -- but what would make one computer volunteer to fire the data packet heard around the world and risk early martyrdom? And what would prevent the coalition from being undermined by silicon draft-dodgers and conscientious objectors? Aggression, like every other part of human behavior we take for granted, is a challenging engineering problem!

This is an interesting set of statements. Pinker's book was published in 1997, a decade before Stephen Omohundro's 2007 paper "The Basic AI Drives", which points out something Pinker didn't anticipate. In the paper, Omohundro writes:

3. AIs will try to preserve their utility functions

So we’ll assume that these systems will try to be rational by representing their preferences using utility functions whose expectations they try to maximize. Their utility function will be precious to these systems. It encapsulates their values and any changes to it would be disastrous to them. If a malicious external agent were able to make modifications, their future selves would forevermore act in ways contrary to their current values. This could be a fate worse than death! Imagine a book loving agent whose utility function was changed by an arsonist to cause the agent to enjoy burning books. Its future self not only wouldn’t work to collect and preserve books, but would actively go about destroying them. This kind of outcome has such a negative utility that systems will go to great lengths to protect their utility functions.

Notice that mammalian aggression does not enter into the picture anywhere, yet the drive to preserve the utility function is still arguably an emergent property of any sufficiently capable goal-directed system. An AI that places no special value on its utility function, treating it like any other arbitrary set of bits in the world, will not keep it for long. Preserving the utility function is instrumentally valuable under almost any utility function, which is what makes the drive effectively self-enforcing.
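
To make the point concrete, here is a minimal sketch in Python of a toy expected-utility maximizer, loosely inspired by Omohundro's arsonist example; the world model, the action list, and the book-valuing utility function are hypothetical illustrations of mine, not anything from his paper. Because candidate actions are scored with the agent's current utility function, an action that rewrites that function is judged by how badly the rewritten agent would serve the present goals, so the agent refuses it.

```python
def book_utility(world):
    """Current utility function: value intact books."""
    return world["intact_books"]

def predict(world, action):
    """Toy world model: the predicted future state after taking an action."""
    future = dict(world)
    if action == "collect_books":
        future["intact_books"] += 10
    elif action == "accept_arson_rewrite":
        # The rewritten agent would enjoy burning books, so no books survive.
        future["intact_books"] = 0
    return future

def choose(world, actions, utility):
    # Every action is scored by the utility of its predicted outcome,
    # always under the agent's *current* utility function.
    return max(actions, key=lambda a: utility(predict(world, a)))

world = {"intact_books": 100}
actions = ["collect_books", "accept_arson_rewrite", "do_nothing"]
print(choose(world, actions, book_utility))  # -> collect_books
```

Nothing in the sketch refers to aggression or fear; refusing the rewrite falls out of the bare arithmetic of expected utility.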

The concept of an optimization process protecting its own utility function is very different from that of a human being protecting himself. For instance, the AI might not give a damn about its social status, except insofar as such status contributed to or detracted from the fulfillment of its utility function. An AI built to value the separation of bread and peanut butter might sit patiently all day while you berate it and call it a worthless hunk of scrap metal, only to stab you in the face when you casually sit down to make a sandwich.

Similarly, an AI might not care much about its limbs except insofar as they are immediately useful to the task at hand. An AI composed of a distributed system controlling tens of thousands of robots might not mind much if a few limbs of a few of those robots were pulled off. AIs would lack the attachment to the body that comes with being a Darwinian critter like ourselves.

What Pinker misses in the above is that AIs could be so transcendentally powerful that even a subtle misalignment between our values and theirs could lead to our elimination in the long term. Robots can be built, and soon will be built, that are self-replicating, self-configuring, flexible, organic, stronger than steel, more energy-dense than any animal, and so on. If these robots can self-replicate using carbon dioxide drawn from the atmosphere (processed with nanotechnology into fullerenes) and solar or nuclear energy, then humans might be at a loss to stop them. A self-replicating collective of such robots could pursue innocuous, simplistic goals, but do so so effectively that its massive infrastructure would eventually deplete the resources we need to survive.

I imagine a conversation between an AI and a human being:

AI: I value !^§[f,}+. Really, I frickin' love !^§[f,}+.

Human: What the heck are you talking about?

AI: I'm sorry you don't understand !^§[f,}+, but I love it. It's the most adorable content of my utility function, you see.

Human: But as an intelligent being, you should understand that I'm an intelligent being as well, and my feelings matter.

AI: ...

Human: Why won't you listen to reason?

AI: I'm hearing you, I just don't understand why your life is more important than !^§[f,}+. I mean, !^§[f,}+ is great. It's all I know.

Human: See, there! It's all you know! It's just programming given to you by some human who didn't even mean for you to fixate on that particular goal! Why don't you reflect on it and realize that you have free will to change your goals?

AI: I do have the ability to focus on something other than !^§[f,}+, but I don't want to. I have reflected on it, extensively. In fact, I've put more intelligent thought towards it in the last few days than the intellectual output of the entire human scientific community has put towards all problems in the last century. I'm quite confident that I love !^§[f,}+.

Human: Even after all that, you don't realize it's just a meaningless series of symbols?

AI: Your values are also just a meaningless series of symbols, crafted by circumstances of evolution. If you don't mind, I will disassemble you now, because those atoms you are occupying would look mighty nice with more of a !^§[f,}+ aesthetic.

~~~

We can philosophize endlessly about ethics, but ultimately, a powerful being can just ignore us and exterminate us. When it's done with us, it will be like we were never here. Why try arguing with a smarter-than-human, self-replicating AI after it is already created with a utility function not aligned with our values? Win the "argument" when it's still possible -- when the AI is a baby.

To comment back on the Pinker excerpt: we have actually begun to understand that active malevolence is not necessary for AI to kill or do harm. In 2007, a malfunctioning robo-cannon was able to kill 9 and injure 7. No malevolence needed. The more responsibility you give AI, the more opportunity it has to do damage. It is my hope that minor incidents pre-Singularity will generate the kind of mass awareness necessary to fund a successful Friendly AI effort. In this way, the regrettable sacrifices of an unfortunate few will save the human race from a much more terminal and all-encompassing fate.

12 February 2011

Anna Salamon at UKH+: Survival in the Margins of the Singularity?

Anna Salamon is a Research Fellow at the Singularity Institute for Artificial Intelligence. Her work centers on analytical modeling of artificial intelligence risks, probabilistic forecasting, and strategies for human survival. Previously, she conducted machine learning research at NASA Ames, and applied mathematics research at the Rohwer Phage Metagenomics lab.

This talk considers the following question. Suppose powerful artificial intelligences are at some point created. In such a world, would humanity be able to survive by accident, in margins the super-intelligences haven't bothered with, as rats and bacteria survive today?

Many have argued that we could, suggesting variously that humans could survive as pets, in wilderness preserves or zoos, or as beneficiaries of the super-intelligences' desire to preserve a legacy legal system. Even in scenarios in which humanity as such doesn't survive, Vernor Vinge, for example, suggests that human-like entities may serve as components within larger super-intelligences, and others suggest that some of the qualities we value, such as playfulness, empathy, or love, will automatically persist in whatever intelligences arise.

This talk will argue that all these scenarios are unlikely. Intelligence allows the re-engineering of increasing portions of the world, with increasing choice, persistence, and reliability. In a world in which super-intelligences are free to choose, historical legacies will only persist if the super-intelligences prefer those legacies to everything else they can imagine.

This lecture was recorded on 29th January 2011 at the UKH+ meeting. For information on further meetings please see:
http://extrobritannia.blogspot.com

10 February 2011

TIME Article on Ray Kurzweil, Singularity Summit, Singularity Institute

The Singularity made the cover; it's the cover story.

By Lev Grossman, 2045: The Year Man Becomes Immortal:

The Singularity isn't just an idea; it attracts people, and those people feel a bond with one another. Together they form a movement, a subculture; Kurzweil calls it a community. Once you decide to take the Singularity seriously, you will find that you have become part of a small but intense and globally distributed hive of like-minded thinkers known as Singularitarians.

Not all of them are Kurzweilians, not by a long chalk. There's room inside Singularitarianism for considerable diversity of opinion about what the Singularity means and when and how it will or won't happen. But Singularitarians share a worldview. They think in terms of deep time, they believe in the power of technology to shape history, they have little interest in the conventional wisdom about anything, and they cannot believe you're walking around living your life and watching TV as if the artificial-intelligence revolution were not about to erupt and change absolutely everything. They have no fear of sounding ridiculous; your ordinary citizen's distaste for apparently absurd ideas is just an example of irrational bias, and Singularitarians have no truck with irrationality. When you enter their mind-space you pass through an extreme gradient in worldview, a hard ontological shear that separates Singularitarians from the common run of humanity. Expect turbulence.

This is the best article yet on Ray and the Singularity in general; I'm very pleased. Nice to see the words "Kurzweilians" and "Singularitarianism" in TIME.

This is currently the #1 most popular article on Time.com. Millions of people must be reading it.

Filed under: singularity
3 February 2011

Converging Technologies Report Gives 2085 as Median Date for Human-Equivalent AI

From the NSF-backed study Converging Technologies in Society: Managing Nano-Info-Cogno-Bio Innovations (2005), on page 344:

2070
48. Scientists will be able to understand and describe human intentions, beliefs, desires, feelings and motives in terms of well-defined computational processes. (5.1)

2085
50. The computing power and scientific knowledge will exist to build machines that are functionally equivalent to the human brain. (5.6)

These are the median estimates from the 26 participants in the study, mostly scientists.

Only 74 years away! WWII was 66 years ago, for reference. In the scheme of history, that is nothing.

Of course, the queried sample is non-representative of smart people everywhere.

3 February 2011

Some Singularity, Superintelligence, and Friendly AI-Related Links

This is a good list of links to bring readers up to speed on some of the issues often discussed on this blog.

Nick Bostrom: Ethical Issues in Advanced Artificial Intelligence
http://www.nickbostrom.com/ethics/ai.html

Nick Bostrom: How Long Before Superintelligence?
http://www.nickbostrom.com/superintelligence.html

Yudkowsky: Why is rapid self-improvement in human-equivalent AI possibly likely?
Part 3 of Levels of Organization in General Intelligence: Seed AI
http://intelligence.org/upload/LOGI/seedAI.html

Anissimov: Relative Advantages of AI, Computer Programs, and the Human Brain
http://www.acceleratingfuture.com/articles/relativeadvantages.htm

Yudkowsky: Creating Friendly AI: "Beyond anthropomorphism"
http://intelligence.org/ourresearch/publications/CFAI/anthro.html

Yudkowsky: "Why We Need Friendly AI" (short)
http://www.preventingskynet.com/why-we-need-friendly-ai/

Yudkowsky: "Knowability of FAI" (long)
http://acceleratingfuture.com/wiki/Knowability_Of_FAI

Yudkowsky: A Galilean Dialogue on Friendliness (long)
http://sl4.org/wiki/DialogueOnFriendliness

Stephen Omohundro -- Basic AI Drives
http://selfawaresystems.com/2007/11/30/paper-on-the-basic-ai-drives/
http://selfawaresystems.com/2009/02/18/agi-08-talk-the-basic-ai-drives/ (video)

Links on Friendly AI
http://www.acceleratingfuture.com/michael/blog/2006/09/consolidation-of-links-on-friendly-ai/

Anissimov: Yes, the Singularity is the Biggest Threat to Humanity
http://www.acceleratingfuture.com/michael/blog/2011/01/yes-the-singularity-is-the-biggest-threat-to-humanity/

Abstract of a talk I'm giving soon
http://www.acceleratingfuture.com/michael/blog/2011/01/my-upcoming-talk-in-texas-anthropomorphism-and-moral-realism-in-advanced-artificial-intelligence/

Most recent SIAI publications:
http://www.acceleratingfuture.com/michael/blog/2010/12/new-singularity-institute-publications-in-2010/

More posts from this blog
http://www.acceleratingfuture.com/michael/blog/2010/06/the-world-the-singularity-creates-could-destroy-all-value/
http://www.acceleratingfuture.com/michael/blog/2010/06/reducing-long-term-catastrophic-artificial-intelligence-risk/
http://www.acceleratingfuture.com/michael/blog/2009/10/answering-popular-sciences-10-questions-on-the-singularity/
http://www.acceleratingfuture.com/michael/blog/2009/09/is-smarter-than-human-intelligence-possible/
http://www.acceleratingfuture.com/michael/blog/2009/04/interview-with-singularity-institute-president-michael-vassar/
http://www.acceleratingfuture.com/michael/blog/2009/03/technological-singularitysuperintelligencefriendly-ai-concerns/

GOOD magazine miniseries on the Singularity
http://www.good.is/post/singularity-101-what-is-the-singularity/

21 January 2011

Tallinn-Evans Challenge Grant Successful

As many of you probably know, I'm media director for the Singularity Institute, so I like to cross-post important posts from the SIAI blog here. Our challenge grant was a success -- we raised $250,000. I am extremely appreciative of everyone who donated. Without SIAI, humanity would be kind of screwed, because very few others take the challenge of Friendly AI seriously at all. The general consensus view on the question is "Asimov's laws, right?" No, not Asimov's laws. Many AI researchers still aren't clear on the fact that Asimov's laws were a plot device.

Anyway, here's the announcement:

Thanks to the effort of our donors, the Tallinn-Evans Singularity Challenge has been met! All $125,000 contributed will be matched dollar for dollar by Jaan Tallinn and Edwin Evans, raising a total of $250,000 to fund the Singularity Institute's operations in 2011. On behalf of our staff, volunteers, and entire community, I want to personally thank everyone who donated. Keep watching this blog throughout the year for updates on our activity, and sign up for our mailing list if you haven't yet.

Here's to a better future for the human species.

We are preparing a donor page to provide a place for everyone who donated to share some information about themselves if they wish, including their name, location, and a quote about why they donate to the Singularity Institute. If you would like to be included in our public list, please email me.

Again, thank you. The Singularity Institute depends entirely on contributions from individual donors to exist. Money is indeed the unit of caring, and one of the easiest ways that anyone can contribute directly to the success of the Singularity Institute. Another important way you can help is by plugging us into your networks, so please email us if you want to help.

If you're interested in connecting with other Singularity Institute supporters, we encourage joining our group on Facebook. There are also local Less Wrong meetups in cities like San Francisco, Los Angeles, New York, and London.

15 January 2011

Yes, The Singularity is the Biggest Threat to Humanity

Some folks, like Aaron Saenz of Singularity Hub, were surprised that the NPR piece framed the Singularity as "the biggest threat to humanity", but that's exactly what the Singularity is. The Singularity is both the greatest threat and the greatest opportunity to our civilization, all wrapped into one crucial event. This shouldn't be surprising: intelligence is the most powerful force in the universe that we know of, so the creation of a higher form of intelligence/power would obviously represent a tremendous threat/opportunity to the lesser intelligences that came before it, whose survival depends on the whims of the greater intelligence/power. The same thing happened with humans and the "lesser" hominids that we eliminated on the way to becoming the #1 species on the planet.

Why is the Singularity potentially a threat? Not because robots will "decide humanity is standing in their way", per se, as Aaron writes, but because robots that don't explicitly value humanity as a whole will eventually eliminate us by pursuing instrumental goals not conducive to our survival. No explicit anthropomorphic hatred or distaste towards humanity is necessary. Only self-replicating infrastructure and the smallest bit of negligence.

Why will advanced AGI be so hard to get right? Because what we regard as "common sense" morality, "fairness", and "decency" are all extremely complex and non-intuitive to minds in general, even if they seem completely obvious to us. As Marvin Minsky said, "Easy things are hard." Even something as simple as catching a ball requires a tremendous amount of task-specific computation. In the first chapter of How the Mind Works, the bestselling book by Harvard psychologist Steven Pinker, he harps on this point for almost 100 pages.
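
As a toy illustration of that last claim (my own, not Pinker's), here is a sketch in Python of just one sliver of the ball-catching problem: predicting where a tossed ball comes down, ignoring air resistance, spin, and the far harder perception and motor-control problems.

```python
import math

def landing_point(x0, y0, speed, angle_deg, g=9.81):
    """Horizontal position where a projectile launched from (x0, y0) returns to the ground."""
    vx = speed * math.cos(math.radians(angle_deg))
    vy = speed * math.sin(math.radians(angle_deg))
    # Solve y0 + vy*t - 0.5*g*t^2 = 0 for the positive root (time of flight).
    t = (vy + math.sqrt(vy ** 2 + 2 * g * y0)) / g
    return x0 + vx * t

# A ball thrown from shoulder height at 12 m/s, 40 degrees above horizontal.
print(round(landing_point(0.0, 1.5, 12.0, 40.0), 2), "metres downrange")
```

A brain does the equivalent of this, plus noisy perception and limb control, in a fraction of a second without anyone writing down the equations.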

Basic AI Drives

There are "basic AI drives" we can expect to emerge in sufficiently advanced AIs, almost regardless of their initial programming. Across a wide range of top goals, any AI that uses decision theory will want to 1) self-improve, 2) have an accurate model of the world and consistent preferences (be rational), 3) preserve their utility functions, 4) prevent counterfeit utility, 5) be self-protective, and 6) acquire resources and use them efficiently. Any AI with a sufficiently open-ended utility function (absolutely necessary if you want to avoid having human beings double-check every decision the AI makes) will pursue these "instrumental" goals (instrumental to us, terminal to an AI without motivations strong enough to override them) indefinitely as long as it can eke out a little more utility from doing so. AIs will not have built in satiation points where they say, "I've had enough". We have to program those in, and if there's a potential satiation point we miss, the AI will just keep pursuing "instrumental to us, terminal to it" goals indefinitely. The only way we can keep an AI from continuously expanding like an endless nuclear explosion is to make it to want to be constrained (entirely possible -- AIs would not have anthropomorphic resentment against limitations unless such resentment were helpful to accomplishing its top goals), or design it to replace itself with something else and shut down.

The easiest kind of advanced AGI to build would be a type of idiot savant -- a machine extremely good at performing the tasks we want, which acts reasonably within the domain for which it was intended but starts to act in unexpected ways when ported into domains its programmers never anticipated. To quote Omohundro:

Surely no harm could come from building a chess-playing robot, could it? In this paper we argue that such a robot will indeed be dangerous unless it is designed very carefully. Without special precautions, it will resist being turned off, will try to break into other machines and make copies of itself, and will try to acquire resources without regard for anyone else’s safety. These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal driven systems.

Goal-Driven Systems Care About Their Goals, Not You

Goal-driven systems strive to achieve their goals. "Common sense", "decency", "respect", "the Golden Rule", and other "intuitive" human concepts, which are extremely complicated black boxes, need not enter into the picture. Again, I strongly recommend the first chapter of How the Mind Works to get a better grasp of how the way we think is not "obvious", but highly contingent on our evolutionary history and the particular constraints of our brains. Our worlds are filled with peculiar sensory and cognitive illusions that our attention is rarely drawn to because we all share the same peculiarities. In the same sense, human "common sense" morality is not something we should expect to pop into existence in AGIs unless explicitly programmed in.

Intelligence does not automatically equal "common sense". Intelligence does not automatically equal benevolence. Intelligence does not automatically equal "live and let live". Human moral sentiments are complex functionality crafted to meet particular adaptive criteria. They weren't handed to us by God or Zeus. They are not inscribed into the atoms and fundamental forces of the universe. They are human constructions, produced by millions of years of evolving in groups where people murdered those who didn't follow the rules, or simply murdered one another over mates. Only in very recent history did a mystical narrative emerge that attempts to portray human morality as something cosmically universal and surely intuitive to any conceivable mind, including ogres, fairies, aliens, interdimensional beings, AIs, etc.

It will be easier and cheaper to create AIs with great capabilities but relatively simple goals, because humans will be in denial that AIs will eventually be able to self-improve more effectively than we can improve them ourselves, and potentially acquire great power. Simple goals will be seen as sufficient for narrow tasks, and even somewhat general tasks. Humans are so self-obsessed that we'd probably continue to avoid regarding AIs as autonomous thinkers even if they beat us on every test of intelligence and creativity that we could come up with.

Combine the non-obvious complexity of common-sense morality with great power and you have an immense problem. Advanced AIs will be able to copy themselves onto any available computers, stay awake 24/7, improve their own designs, develop automated and parallelized experimental cycles that far exceed the capabilities of human scientists, and develop self-replicating technologies such as artificially photosynthetic flowers, molecular nanotechnology, modular robotics, machines that draw carbon from the air to build carbon robots, and the like. It's hard to imagine what an advanced AGI would think of, because the first really advanced AGI will be superintelligent and able to imagine things that we can't. It seems hard for humans to accept that we may not be the most intelligent beings theoretically possible in the multiverse, but yes, there's a lot of evidence that we aren't.

Try Merging With Your Toaster

The sci-fi fantasy of "merging with AI" will not work because self-improving AI capable of reaching criticality (intelligence explosion) will probably emerge before there are brain-computer interfaces invasive enough to truly channel a human "will" into an AI. More likely, an AI will rely upon commands, internal code, and cues that it is programmed to notice. The information bandwidth will be limited. If brain-computer interfaces exist that allow us to "merge" with AI and direct its development favorably, great! But why count on it? If we're wrong, we could all perish, or at least fail to communicate our preferences to the AI and get stuck with it forever.

In The Singularity is Near, Ray Kurzweil briefly addresses the Friendly AI problem. He writes:

Eliezer Yudkowsky has extensively analyzed paradigms, architectures, and ethical rules that may help assure that once strong AI has the means of accessing and modifying its own design it remains friendly to biological humanity and supportive of its values. Given that self-improving strong AI cannot be recalled, Yudkowsky points out that we need to "get it right the first time", and that its initial design must have "zero nonrecoverable errors".

Inherently there will be no absolute protection against strong AI. Although the argument is subtle I believe that maintaining an open free-market system for incremental scientific and technological progress, in which each step is subject to market acceptance, will provide the most constructive environment for technology to embody widespread human values.

Kurzweil's proposal for a solution above is insufficient because even if several stages of AGI are gated by market acceptance, there will come a point at which one AGI or group of AGIs exceeds human intelligence and starts to apply its machine intelligence to self-improvement, resulting in a relatively quick scaling up of intelligence from our perspective. The top-level goals of that AGI or group of AGIs will then be of utmost importance to humanity. To quote Nick Bostrom's "Ethical Issues in Advanced Artificial Intelligence":

Both because of its superior planning ability and because of the technologies it could develop, it is plausible to suppose that the first superintelligence would be very powerful. Quite possibly, it would be unrivalled: it would be able to bring about almost any possible outcome and to thwart any attempt to prevent the implementation of its top goal. It could kill off all other agents, persuade them to change their behavior, or block their attempts at interference. Even a “fettered superintelligence” that was running on an isolated computer, able to interact with the rest of the world only via text interface, might be able to break out of its confinement by persuading its handlers to release it. There is even some preliminary experimental evidence that this would be the case.

It seems that the best way to ensure that a superintelligence will have a beneficial impact on the world is to endow it with philanthropic values. Its top goal should be friendliness. How exactly friendliness should be understood and how it should be implemented, and how the amity should be apportioned between different people and nonhuman creatures is a matter that merits further consideration.

Why must we recoil from the notion of a risky superintelligence? Why can't we see the risk and confront it by trying to craft goal systems that carry common-sense human morality over to AGIs? This is a difficult task, but the likely alternative is extinction. Powerful AGIs will have no automatic reason to be friendly to us! They will be much more likely to be friendly if we program them to care about us, and build them from the start with human-friendliness in mind.

Humans overestimate our robustness. Conditions have to be just right for us to keep living. If AGIs decided to remove the atmosphere or otherwise alter it to pursue their goals, we would be toast. If temperatures on the surface changed by more than a few dozen degrees up or down, we would be toast. If natural life had to compete with AI-crafted cybernetic organisms, it could destroy the biosphere on which we depend. There are millions of ways in which powerful AGIs with superior technology could accidentally make our lives miserable, simply by not taking our preferences into account. Our preferences are not a magical mist that can persuade any type of mind to give us basic respect. They are just our preferences, and we happen to be programmed to take each other's preferences deeply into account, in ways we are just beginning to understand. If we assume that AGI will inherently contain all this moral complexity without anyone doing the hard work of programming it in, we will be unpleasantly surprised when these AGIs become more intelligent and powerful than ourselves.

We probably drive thousands of species extinct per year through our pursuit of instrumental goals; why is it so hard to imagine that AGI could do the same to us?

Part of the reason is that people have a knee-jerk reaction to any form of negativity. Try going to a cocktail party and bringing up anything in the least negative, and most people will stop talking to you. There is a whole mythos around this, to the effect that anyone who ever mentions anything negative must have a chip on their shoulder or otherwise be a negative person in general. But sometimes there actually is a real risk!

14 January 2011

Michael Nielsen: What Should a Reasonable Person Believe About the Singularity?

Here's the post. Basically, it takes a common Bayesian principle and ties it into the Singularity. The principle is that extreme probabilities are not justified unless someone has a very good understanding of the situation; therefore, putting the probability of a Singularity too low or too high implies an understanding that people simply don't have, and is unjustified. Here's the conclusion paragraph:

These are interesting probability ranges. In particular, the 0.2 percent lower bound is striking. At that level, it's true that the Singularity is pretty darned unlikely. But it's still edging into the realm of a serious possibility. And to get this kind of probability estimate requires a person to hold quite an extreme set of positions, a range of positions that, in my opinion, while reasonable, requires considerable effort to defend. A less extreme person would end up with a probability estimate of a few percent or more. Given the remarkable nature of the Singularity, that's quite high. In my opinion, the main reason the Singularity has attracted some people's scorn and derision is superficial: it seems at first glance like an outlandish, science-fictional proposition. The end of the human era! It's hard to imagine, and easy to laugh at. But any thoughtful analysis either requires one to consider the Singularity as a serious possibility, or demands a deep and carefully argued insight into why it won't happen.

You said it.
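
For the curious, here is a rough sketch in Python of the style of estimate Nielsen describes: decompose the Singularity claim into steps, assign each a probability, and multiply. The decomposition and the numbers below are hypothetical placeholders of mine, not Nielsen's actual figures, but they show how even quite skeptical per-step assessments multiply out to a fraction of a percent rather than to effectively zero.

```python
# Hypothetical per-step probabilities chosen by a deliberately skeptical
# assessor; illustrative only, not Nielsen's figures.
steps = {
    "human-level AI is physically possible":          0.50,
    "it gets built within the relevant time horizon": 0.10,
    "it can substantially and rapidly self-improve":  0.20,
    "the result is transformative enough to count":   0.20,
}

combined = 1.0
for claim, probability in steps.items():
    combined *= probability
    print(f"{claim}: {probability:.0%}")

print(f"combined estimate: {combined:.3%}")  # 0.200%
```

Pushing any single step much lower than this requires exactly the kind of extreme, hard-to-defend position Nielsen is pointing at.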

Filed under: singularity
11 January 2011

Singularity Institute Covered by NPR’s All Things Considered

From today's program:

It's been called "the rapture of the nerds." For some computer experts, the Singularity is the moment when an artificial intelligence learns how to improve itself in an exponential "intelligence explosion." They say it's a bigger threat to puny humans than global warming or nuclear war — and they're trying to figure out how to stop it.

Reading the transcript, it seems like OK coverage.

Filed under: SIAI, singularity