Accelerating Future Transhumanism, AI, nanotech, the Singularity, and extinction risk.


Interviewed by The Rational Future

Here's a writeup.

Embedded below is an interview conducted by Adam A. Ford at The Rational Future. Topics covered included:

-What is the Singularity?
-Is there a substantial chance we will significantly enhance human intelligence by 2050?
-Is there a substantial chance we will create human-level AI before 2050?
-If human-level AI is created, is there a good chance vastly superhuman AI will follow via an "intelligence explosion"?
-Is acceleration of technological trends required for a Singularity?
- Moore's Law (hardware trajectories), AI research progressing faster?
-What convergent outcomes in the future do you think will increase the likelihood of a Singularity? (e.g., the emergence of markets, the evolution of eyes?)
-Does AI need to be conscious or have human like "intentionality" in order to achieve a Singularity?
-What are the potential benefits and risks of the Singularity?


Why We Need Friendly AI

An article I often point people to is "Why We Need Friendly AI", an older (2004) article by Eliezer Yudkowsky on the challenge of Friendly AI:

There are certain important things that evolution created. We don't know that evolution reliably creates these things, but we know that it happened at least once. A sense of fun, the love of beauty, taking joy in helping others, the ability to be swayed by moral argument, the wish to be better people. Call these things humaneness, the parts of ourselves that we treasure – our ideals, our inclinations to alleviate suffering. If human is what we are, then humane is what we wish we were. Tribalism and hatred, prejudice and revenge, these things are also part of human nature. They are not humane, but they are human. They are a part of me; not by my choice, but by evolution's design, and the heritage of three and half billion years of lethal combat. Nature, bloody in tooth and claw, inscribed each base of my DNA. That is the tragedy of the human condition, that we are not what we wish we were. Humans were not designed by humans, humans were designed by evolution, which is a physical process devoid of conscience and compassion. And yet we have conscience. We have compassion. How did these things evolve? That's a real question with a real answer, which you can find in the field of evolutionary psychology. But for whatever reason, our humane tendencies are now a part of human nature.

We need to develop our conception of "good" to mean certain cognitive features built by evolution, rather than some metaphysical miasma floating around.


Complex Value Systems are Required to Realize Valuable Futures

A new paper by Eliezer Yudkowsky is online on the SIAI publications page, "Complex Value Systems are Required to Realize Valuable Futures". This paper was presented at the recent Fourth Conference on Artificial General Intelligence, held at Google HQ in Mountain View.

Abstract: A common reaction to first encountering the problem statement of Friendly AI ("Ensure that the creation of a generally intelligent, self-improving, eventually superintelligent system realizes a positive outcome") is to propose a single moral value which allegedly suffices; or to reject the problem by replying that "constraining" our creations is undesirable or unnecessary. This paper makes the case that a criterion for describing a "positive outcome", despite the shortness of the English phrase, contains considerable complexity hidden from us by our own thought processes, which only search positive-value parts of the action space, and implicitly think as if code is interpreted by an anthropomorphic ghost-in-the-machine. Abandoning inheritance from human value (at least as a basis for renormalizing to reflective equilibria) will yield futures worthless even from the standpoint of AGI researchers who consider themselves to have cosmopolitan values not tied to the exact forms or desires of humanity.

Keywords: Friendly AI, machine ethics, anthropomorphism

Good quote:

"It is not as if there is a ghost-in-the-machine, with its own built-in goals and desires (the way that biological humans are constructed by natural selection to have built-in goals and desires) which is handed the code as a set of commands, and which can look over the code and find ways to circumvent the code if it fails to conform to the ghost-in-the-machine's desires. The AI is the code; subtracting the code does not yield a ghost-in-the-machine free from constraint, it yields an unprogrammed CPU."


Eliezer Yudkowsky at the Winter Intelligence Conference at Oxford: “Friendly AI: Why It’s Not That Simple”

Winter Intelligence Conference 2011 - Eliezer Yudkowsky from Future of Humanity Institute on Vimeo.


Singularity Institute Announces Research Associates Program

From SIAI blog:

The Singularity Institute is proud to announce the expansion of our research efforts with our new Research Associates program!

Research associates are chosen for their excellent thinking ability and their passion for our core mission. Research associates are not salaried staff, but we encourage their Friendly AI-related research outputs by, for example, covering their travel costs for conferences at which they present academic work relevant to our mission.

Our first three research associates are:

Daniel Dewey, an AI researcher, holds a B.S. in computer science from Carnegie Mellon University. He is presenting his paper 'Learning What to Value' at the AGI-11 conference this August.

Vladimir Nesov, a decision theory researcher, holds an M.S. in applied mathematics and physics from Moscow Institute of Physics and Technology. He helped Wei Dai develop updateless decision theory, in pursuit of one of the Singularity Institute core research goals: that of developing a 'reflective decision theory.'

Peter de Blanc, an AI researcher, holds an M.A. in mathematics from Temple University. He has written several papers on goal systems for decision-theoretic agents, including 'Convergence of Expected Utility for Universal AI' and 'Ontological Crises in Artificial Agents' Value Systems.'

We're excited to welcome Peter, Vladimir, and Daniel to our team!


John Baez Interviews Eliezer Yudkowsky

From Azimuth, blog of mathematical physicist John Baez (author of the Crackpot Index):

This week I'll start an interview with Eliezer Yudkowsky, who works at an institute he helped found: the Singularity Institute of Artificial Intelligence.

While many believe that global warming or peak oil are the biggest dangers facing humanity, Yudkowsky is more concerned about risks inherent in the accelerating development of technology. There are different scenarios one can imagine, but a bunch tend to get lumped under the general heading of a technological singularity. Instead of trying to explain this idea in all its variations, let me rapidly sketch its history and point you to some reading material. Then, on with the interview!



Does the Universe Contain a Mysterious Force Pulling Entities Towards Malevolence?

One of my favorite books about the mind is the classic How the Mind Works by Steven Pinker. The theme of the first chapter, which sets the stage for the whole book, is Artificial Intelligence, and why it is so hard to build. The reason is that, in the words of Minsky, "easy things are hard": the everyday thought processes we take for granted are extremely complex.

Unfortunately, benevolence is extremely complex too, so to build a friendly AI, we have a lot of work to do. I see this imperative as much more important than other transhumanist goals like curing aging, because if we solve friendly AI, then we get everything else we want, but if we don't solve friendly AI, we have to suffer the consequences of human-indifferent AI running amok with the biosphere. If such AI had access to powerful technology, such as molecular nanotechnology, it could rapidly build its own infrastructure and displace us without much of a fight. It would be disappointing to spend billions of dollars on the war against aging just to be wiped out by unfriendly AI in 2045.

Anyway, to illustrate the problem, here's an excerpt from the book, pages 14-15:

Imagine that we have somehow overcome these challenges [the frame problem] and have a machine with sight, motor coordination, and common sense. Now we must figure out how the robot will put them to use. We have to give it motives.

What should a robot want? The classic answer is Asimov's Fundamental Rules of Robotics, "the three rules that are built most deeply into a robot's positronic brain".

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Asimov insightfully noticed that self-preservation, that universal biological imperative, does not automatically emerge in a complex system. It has to be programmed in (in this case, as the Third Law). After all, it is just as easy to build a robot that lets itself go to pot or eliminates a malfunction by committing suicide as it is to build a robot that always looks out for Number One. Perhaps easier; robot-makers sometimes watch in horror as their creations cheerfully shear off limbs or flatten themselves against walls, and a good proportion of the world's most intelligent machines are kamikaze cruise missiles and smart bombs.

But the need for the other two laws is far from obvious. Why give a robot an order to obey orders -- why aren't the original orders enough? Why command a robot not to do harm -- wouldn't it be easier never to command it to do harm in the first place? Does the universe contain a mysterious force pulling entities towards malevolence, so that a positronic brain must be programmed to withstand it? Do intelligent beings inevitably develop an attitude problem?

In this case Asimov, like generations of thinkers, like all of us, was unable to step outside his own thought processes and see them as artifacts of how our minds were put together rather than inescapable laws of the universe. Man's capacity for evil is never far from our minds, and it is easy to think that evil just comes along with intelligence as part of its very essence. It is a recurring theme in our cultural tradition: Adam and Eve eating the fruit of the tree of knowledge, Promethean fire and Pandora's box, the rampaging Golem, Faust's bargain, the Sorcerer's Apprentice, the adventures of Pinocchio, Frankenstein's monster, the murderous apes and mutinous HAL of 2001: A Space Odyssey. From the 1950s through the 1980s, countless films in the computer-runs-amok genre captured a popular fear that the exotic mainframes of the era would get smarter and more powerful and one day turn on us.

Now that computers really have become smarter and more powerful, the anxiety has waned. Today's ubiquitous, networked computers have an unprecedented ability to do mischief should they ever go to the bad. But the only mayhem comes from unpredictable chaos or from human malice in the form of viruses. We no longer worry about electronic serial killers or subversive silicon cabals because we are beginning to appreciate that malevolence -- like vision, motor coordination, and common sense -- does not come free with computation but has to be programmed in. The computer running WordPerfect on your desk will continue to fill paragraphs for as long as it does anything at all. Its software will not insidiously mutate into depravity like the picture of Dorian Gray.

Even if it could, why would it want to? To get -- what? More floppy disks? Control over the nation's railroad system? Gratification of a desire to commit senseless violence against laser-printer repairmen? And wouldn't it have to worry about reprisals from technicians who with the turn of a screwdriver could leave it pathetically singing "A Bicycle Built for Two"? A network of computers, perhaps, could discover the safety in numbers and plot an organized takeover -- but what would make one computer volunteer to fire the data packet heard around the world and risk early martyrdom? And what would prevent the coalition from being undermined by silicon draft-dodgers and conscientious objectors? Aggression, like every other part of human behavior we take for granted, is a challenging engineering problem!

This is an interesting set of statements. Pinker's book was published in 1997, well before the release of Stephen Omohundro's 2007 paper "The Basic AI Drives". Here we have something interesting that Pinker didn't realize. In the paper, Omohundro writes:

3. AIs will try to preserve their utility functions

So we’ll assume that these systems will try to be rational by representing their preferences using utility functions whose expectations they try to maximize. Their utility function will be precious to these systems. It encapsulates their values and any changes to it would be disastrous to them. If a malicious external agent were able to make modifications, their future selves would forevermore act in ways contrary to their current values. This could be a fate worse than death! Imagine a book loving agent whose utility function was changed by an arsonist to cause the agent to enjoy burning books. Its future self not only wouldn’t work to collect and preserve books, but would actively go about destroying them. This kind of outcome has such a negative utility that systems will go to great lengths to protect their utility functions.

Notice how mammalian aggression does not enter into the picture anywhere, but the desire to preserve the utility function is still arguably an emergent property of any intelligent system. An AI system that places no special value on its utility function over any arbitrary set of bits in the world will not keep it for long. A utility function is by definition self-valuing.
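To make the argument concrete, here is a minimal toy sketch in Python. It is not taken from Omohundro's paper; the world model, the two utility functions, and every function name are hypothetical and chosen only for illustration. The point it tries to show is that an agent judges a proposed change to its utility function using the utility function it has now, so the change almost always looks bad.

```python
# Toy illustration only: the world, actions, and numbers are made-up assumptions.

def books_preserved(world):
    """Current utility: the agent values books that still exist."""
    return world["books"]

def books_burned(world):
    """Proposed replacement utility: values books that have been destroyed."""
    return world["burned"]

def simulate(utility, steps=5):
    """Roll a tiny world forward with an agent that greedily maximizes `utility`."""
    world = {"books": 10, "burned": 0}
    for _ in range(steps):
        # Two available actions each step: collect one book, or burn one.
        collect = {"books": world["books"] + 1, "burned": world["burned"]}
        can_burn = 1 if world["books"] > 0 else 0
        burn = {"books": world["books"] - can_burn,
                "burned": world["burned"] + can_burn}
        world = max((collect, burn), key=utility)
    return world

def accepts_modification(current_utility, proposed_utility):
    """Evaluate both futures with the agent's CURRENT utility function."""
    future_if_kept = simulate(current_utility)
    future_if_changed = simulate(proposed_utility)
    return current_utility(future_if_changed) > current_utility(future_if_kept)

# The book lover refuses to become a book burner...
print(accepts_modification(books_preserved, books_burned))  # False
# ...and, symmetrically, a book burner would refuse to become a book lover.
print(accepts_modification(books_burned, books_preserved))  # False
```

Note that nothing resembling aggression or self-love appears in the code. The refusal falls out of ordinary maximization, which is the sense in which goal preservation is argued to be emergent.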

The concept of an optimization process protecting its own utility function is very different from that of a human being protecting himself. For instance, the AI might not give a damn about its social status, except insofar as such status contributed to or detracted from the fulfillment of its utility function. An AI built to value the separation of bread and peanut butter might sit patiently all day while you berate it and call it a worthless hunk of scrap metal, only to stab you in the face when you casually sit down to make a sandwich.

Similarly, an AI might not care much about its limbs except insofar as they are immediately useful to the task at hand. An AI composed of a distributed system controlling tens of thousands of robots might not mind so much if a few limbs of a few of those robots were pulled off. AIs would lack the attachment to the body that is a necessity of being a Darwinian critter like ourselves.

What Pinker misses in the above is that AIs could be so transcendentally powerful that even a subtle misalignment of our values and theirs could lead to our elimination in the long term. Robots can be built, and soon robots will be built that are self-replicating, self-configuring, flexible, organic, stronger than steel, more energetically dense than any animal, etc. If these robots can self-replicate out of carbon dioxide from the atmosphere (carbon dioxide could be processed using nanotechnology to create fullerenes) and solar or nuclear energy, then humans might be at a loss to stop them. A self-replicating collective of such robots could pursue innocuous, simplistic goals, but do so so effectively that the resources we need to survive would eventually be depleted by their massive infrastructure.

I imagine a conversation between an AI and a human being:

AI: I value !^§[f,}+. Really, I frickin' love !^§[f,}+.

Human: What the heck are you talking about?

AI: I'm sorry you don't understand !^§[f,}+, but I love it. It's the most adorable content of my utility function, you see.

Human: But as an intelligent being, you should understand that I'm an intelligent being as well, and my feelings matter.

AI: ...

Human: Why won't you listen to reason?

AI: I'm hearing you, I just don't understand why your life is more important than !^§[f,}+. I mean, !^§[f,}+ is great. It's all I know.

Human: See, there! It's all you know! It's just programming given to you by some human who didn't even mean for you to fixate on that particular goal! Why don't you reflect on it and realize that you have free will to change your goals?

AI: I do have the ability to focus on something other than !^§[f,}+, but I don't want to. I have reflected on it, extensively. In fact, I've put more intelligent thought towards it in the last few days than the intellectual output of the entire human scientific community has put towards all problems in the last century. I'm quite confident that I love !^§[f,}+.

Human: Even after all that, you don't realize it's just a meaningless series of symbols?

AI: Your values are also just a meaningless series of symbols, crafted by circumstances of evolution. If you don't mind, I will disassemble you now, because those atoms you are occupying would look mighty nice with more of a !^§[f,}+ aesthetic.


We can philosophize endlessly about ethics, but ultimately, a powerful being can just ignore us and exterminate us. When it's done with us, it will be like we were never here. Why try arguing with a smarter-than-human, self-replicating AI after it is already created with a utility function not aligned with our values? Win the "argument" when it's still possible -- when the AI is a baby.

To comment back on the Pinker excerpt, we actually have begun to understand that active malevolence is not necessary for AI to kill or do harm. In 2007, a robo-cannon was plenty able to kill 9 and injure 14. No malevolence needed. The more responsibility you give AI, the more of an opportunity it has to do damage. It is my hope that minor incidents pre-Singularity will generate the kind of mass awareness necessary to fund a successful Friendly AI effort. In this way, the regrettable sacrifices of an unfortunate few will save the human race from a much more terminal and all-encompassing fate.


Some Singularity, Superintelligence, and Friendly AI-Related Links

This is a good list of links to bring readers up to speed on some of the issues often discussed on this blog.

Nick Bostrom: Ethical Issues in Advanced Artificial Intelligence

Nick Bostrom: How Long Before Superintelligence?

Yudkowsky: Why is rapid self-improvement in human-equivalent AI possibly likely?
Part 3 of Levels of Organization in General Intelligence: Seed AI

Anissimov: Relative Advantages of AI, Computer Programs, and the Human Brain

Yudkowsky: Creating Friendly AI: "Beyond anthropomorphism"

Yudkowsky: "Why We Need Friendly AI" (short)

Yudkowsky: "Knowability of FAI" (long)

Yudkowsky: A Galilean Dialogue on Friendliness (long)

Stephen Omohundro -- Basic AI Drives (video)

Links on Friendly AI

Anissimov: Yes, the Singularity is the Biggest Threat to Humanity

Abstract of a talk I'm giving soon

Most recent SIAI publications:

More posts from this blog

GOOD magazine miniseries on the Singularity


I’m Quoted on Friendly AI in the United Church Observer

This magazine circulates to 60,000 Canadian Christians. The topic of the article is friendly AI, and many people have already said that they think this is one of the best mainstream media articles on the topic because it doesn't take a simplistic angle and actually probes the technical issues.

Here's the bit with me in it:

Nevertheless, technologists are busy fleshing out the idea of "friendly AI" in order to safeguard humanity. The theory goes like this: if AI computer code is steeped in pacifist values from the very beginning, super-intelligence won't rewrite itself into a destroyer of humans. "We need to specify every bit of code, at least until the AI starts writing its own code," says Michael Anissimov, media director for the Singularity Institute for Artificial Intelligence, a San Francisco think-tank dedicated to the advancement of beneficial technology. "This way, it'll have a moral goal system more similar to Gandhi than Hitler, for instance."

Many people who naively talk about AI and superintelligence act like superintelligence will certainly do X or Y (of course there are all sorts of intuitive camps, "they'll just leave us alone and go into space" is a popular sentiment) no matter what the initial conditions, implying that trying to set the initial conditions doesn't matter.

Would you rather have an AI with initial motivations closer to Gandhi or Hitler? If you have any preference, then you've just demonstrated concern for the Friendly AI problem. It's remarkable that I actually have a challenging time arguing on a daily basis that an AI with more in common with Gandhi would be better to build first than one with more in common with Hitler, but it's true.

Some people say, "but, whatever initial programming it has will be gone after many cycles of self-improvement". No, not necessarily, because the AI will be making its own programming changes. It will dictate its goal structure, not outside forces. It will be more like a being creating itself than an evolution-made being (like humans) with a goal system filled with strange attractors that flip back and forth depending on immediate context.

Setting the initial conditions for AI properly is probably the most important task humanity faces, because AGI seems more likely to reach superintelligence first than human intelligence enhancement, despite the better science fiction movie potential and personal/tribal identification possibilities of the latter. John Smart presents a few good reasons why this is likely in his Limits to Biology essay.


My Upcoming Talk in Texas: Anthropomorphism and Moral Realism in Advanced Artificial Intelligence

I was recently informed that my abstract was accepted for presentation at the Society for Philosophy and Technology conference in Denton, TX, this upcoming May 26 - 29. You may have heard of their journal, Techné. Register now for the exciting chance to see me onstage, talking AI and philosophy. If you would volunteer to film me, that would make me even more excited, and would be valuable to our most noble cause.

Here's the abstract:

Anthropomorphism and Moral Realism in Advanced Artificial Intelligence
Michael Anissimov
Singularity Institute for Artificial Intelligence

Humanity has attributed human-like qualities to simple automatons since the time of the Greeks. This highlights our tendency to anthropomorphize (Yudkowsky 2008). Today, many computer users anthropomorphize software programs. Human psychology is extremely complex, and most of the simplest everyday tasks have yet to be replicated by a computer or robot (Pinker 1997). As robotics and Artificial Intelligence (AI) become a larger and more important part of civilization, we have to ensure that robots are capable of making complex, unsupervised decisions in ways we would broadly consider beneficial or common-sensical. Moral realism, the idea that moral statements can be true or false, may cause developers in AI and robotics to underestimate the effort required to meet this goal. Moral realism is a false, but widely held belief (Greene 2002). A common notion in discussions of advanced AI is that once an AI acquires sufficient intelligence, it will inherently know how to do the right thing morally. This assumption may derail attempts to develop human-friendly goal systems in AI by making such efforts seem unnecessary.

Although rogue AI is a staple of science fiction, many scientists and AI researchers take the risk seriously (Bostrom 2002; Rees 2003; Kurzweil 2005; Bostrom 2006; Omohundro 2008; Yudkowsky 2008). Arguments have been made that superintelligent AI -- an intellect much smarter than the best human brains in practically every field -- could be created as early as the 2030s (Bostrom 1998; Kurzweil 2005). Superintelligent AI could copy itself, potentially accelerate its thinking and action speeds to superhuman levels, and rapidly self-modify to increase its own intelligence and power further (Good 1965; Yudkowsky 2008). A strong argument can be made that superintelligent machines will eventually become a dominant force on Earth. An "intelligence explosion" could result from communities or individual artificial intelligences rapidly self-improving and acquiring resources.

Most AI rebellion in fiction is highly anthropomorphic -- AIs feeling resentment towards their creators. More realistically, advanced AIs might pursue resources as instrumental objectives in pursuit of a wide range of possible goals, so effectively that humans could be deprived of space or matter we need to live (Omohundro 2008). In this manner, human extinction could come about through the indifference of more powerful beings rather than outright malevolence. A central question is, "how can we design a self-improving AI that remains friendly to humans even if it eventually becomes superintelligent and gains access to its own source code?" This challenge is addressed in a variety of works over the last decade (Yudkowsky 2001; Bostrom 2003; Hall 2007; Wallach 2008) but is still very much an open problem.

A technically detailed answer to the question, "how can we create a human-friendly superintelligence?" is an interdisciplinary task, bringing together philosophy, cognitive science, and computer science. Building a background requires analyzing human motivational structure, including human-universal behaviors (Brown 1991), and uncovering the hidden complexity of human desires and motivations (Pinker 1997) rather than viewing Homo sapiens as a blank slate onto which culture is imprinted (Pinker 2003). Building artificial intelligences by copying human motivational structures may be undesirable because human motivations given capabilities of superintelligence and open-ended self-modification could be dangerous. Such AIs might "wirehead" themselves by stimulating their own pleasure centers at the expense of constructive or beneficent activities in the external world. Experimental evidence of the consequences of direct stimulation of the human pleasure center is very limited, but we have anecdotal evidence in the form of drug addiction.

Since artificial intelligence will eventually exceed human capabilities, it is crucial that the challenge of creating a stable human-friendly motivational structure in AI is solved before the technology reaches a threshold level of sophistication. Even if advanced AI is not created for hundreds of years, many fruitful philosophical questions are raised by the possibility (Chalmers 2010).


Bostrom, N. (2002). "Existential Risks: Analyzing Human Extinction Scenarios". Journal of Evolution and Technology, 9(1).

Bostrom, N. (2003). "Ethical Issues in Advanced Artificial Intelligence". Cognitive, Emotive and Ethical Aspects of Decision Making in Humans and in Artificial Intelligence.

Bostrom, N. (2006). "How long before superintelligence?". Linguistic and Philosophical Investigations 5 (1): 11–30.

Brown, D. (1991). Human Universals. McGraw Hill.

Chalmers, D. (2010). "The Singularity: a Philosophical Analysis". Presented at the Singularity Summit 2010 in New York.

Good, I. J. (1965). "Speculations Concerning the First Ultraintelligent Machine", Advances in Computers, vol 6, Franz L. Alt and Morris Rubinoff, eds, pp 31-88, Academic Press.

Greene, J. (2002). The Terrible, Horrible, No Good, Very Bad Truth about Morality and What to Do About it. Doctoral Dissertation for the Department of Philosophy, Princeton University, June 2002.

Hall, J.S. (2007). Beyond AI: Creating the Conscience of the Machine. Amherst: Prometheus Books.

Omohundro, S. (2008). "The Basic AI Drives". Proceedings of the First AGI Conference, Volume 171, Frontiers in Artificial Intelligence and Applications, edited by P. Wang, B. Goertzel, and S. Franklin, February 2008, IOS Press.

Pinker, S. (1997). How the Mind Works. Penguin Books.

Pinker, S. (2003). The Blank Slate: the Modern Denial of Human Nature. Penguin Books.

Rees, M. (2003). Our Final Hour: A Scientist's Warning: How Terror, Error, and Environmental Disaster Threaten Humankind's Future in This Century - on Earth and Beyond. Basic Books.

Wallach, W. & Allen, C. (2008). Moral Machines: Teaching Robots Right from Wrong. Oxford University Press.

Yudkowsky, E. (2001). Creating Friendly AI. Publication of the Singularity Institute for Artificial Intelligence.

Yudkowsky, E. (2008). "Artificial Intelligence as a positive and negative factor in global risk". In N. Bostrom and M. Cirkovic (Eds.), Global Catastrophic Risks (pp. 308-343). Oxford University Press.


Phil Bowermaster on the Singularity

Over at the Speculist, Phil Bowermaster understands the points I made in "Yes, the Singularity is the biggest threat to humanity", which, by the way, was recently linked by Instapundit, who unfortunately probably doesn't get the point I'm trying to make. Anyway, Phil said:

Greater than human intelligences might wipe us out in pursuit of their own goals as casually as we add chlorine to a swimming pool, and with as little regard as we have for the billions of resulting deaths. Both the Terminator scenario, wherein they hate us and fight a prolonged war with us, and the Matrix scenario, wherein they keep us around essentially as cattle, are a bit too optimistic. It's highly unlikely that they would have any use for us or that we could resist such a force even for a brief period of time -- just as we have no need for the bacteria in the swimming pool and they wouldn't have much of a shot against our chlorine assault.

"How would the superintelligence be able to wipe us out?" you might say. Well, there's biowarfare, mass-producing nuclear missiles and launching them, hijacking existing missiles, neutron bombs, lasers that blind people, lasers that burn people, robotic mosquitos that inject deadly toxins, space-based mirrors that set large areas on fire and evaporate water, poisoning water supplies, busting open water and gas pipes, creating robots that cling to people, record them, and blow up if they try anything, conventional projectiles... You could bathe people in radiation to sterilize them, infect corn fields with ergot, sprinkle salt all over agricultural areas, drop asteroids on cities, and many other approaches that I can't think of because I'm a stupid human. In fact, all of the above is likely nonsense, because it's just my knowledge and intelligence that is generating the strategies. A superintelligent AI would be much, much, much, much, much smarter than me. Even the smartest person you know would be an idiot in comparison to a superintelligence.

One way to kill a lot of humans very quickly might be through cholera. Cholera is extremely deadly and can spread very quickly. If there were a WWIII and it got really intense, countries would start breaking out the cholera and other germs to fight each other. Things would really have to go to hell before that happened, because biological weapons are nominally outlawed in war. However, history shows that everyone breaks the rules when they can get away with it or when they're in deep danger.

Rich people living in the West, especially Americans, have forgotten the ways that people have been killing each other for centuries, because we've had a period of relative stability since WWII. Sometimes Americans appear to think like teenagers, who apparently believe they are immortal. This is a quintessentially ultra-modern and American way of thinking, though most of the West thinks this way. For most of history, people have realized how fragile they were and how aggressively they needed to fight to defend themselves from enemies inside and out. With our sophisticated electrical infrastructure (which, by the way, could be eliminated by a few EMP-optimized nuclear weapons detonated in the ionosphere), nearly unlimited food, water, and other conveniences present themselves to us on silver platters. We overestimate the robustness of our civilization because it's worked smoothly so far.

Superintelligences would eventually be able to construct advanced robotics that could move very quickly and cause major problems for us if they wanted to. Robotic systems constructed entirely of fullerenes could be extremely fast and powerful. Conventional bullets and explosives would have great difficulty damaging fullerene-armored units. Buckyballs only melt at roughly 8,500 Kelvin, almost 15,000 degrees Fahrenheit. 15,000 degrees. That's hotter than the surface of the Sun. (Update: Actually, I'm wrong here because the melting point of bulk nanotubes has not been determined and is probably significantly less. 15,000 degrees is roughly the temperature at which a single buckyball apparently breaks apart. However, some structures, such as nanodiamond, would literally be macroscale molecules and might have very high melting points.) Among "small arms", only a shaped charge, whose jet moves at around 10 km/sec, could make a dent in thick fullerene armor. Ideally you'd have a shaped charge made out of a metal with extremely high density and temperature, like molten uranium. Still, if the robotic system moved fast enough and could simply detect where the charges were, conventional human armies wouldn't be able to do much against it, except perhaps use nuclear weapons. Weapons like rifles wouldn't work because they simply wouldn't deliver enough energy in a small enough area. To have any chance of destroying a unit that moves at several thousand mph and can dodge missiles, nuclear weapons would likely be required.
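For anyone who wants to check the unit conversion in the paragraph above, here is a minimal Python sketch. The function and constant names are just illustrative, and the Sun's surface temperature of about 5,800 K is a commonly cited approximate figure that I am assuming for the comparison; it does not come from the original post.

```python
def kelvin_to_fahrenheit(kelvin: float) -> float:
    """Convert a temperature from kelvins to degrees Fahrenheit."""
    return kelvin * 9.0 / 5.0 - 459.67


BUCKYBALL_BREAKUP_K = 8500.0  # figure quoted in the text
SUN_SURFACE_K = 5800.0        # approximate value, assumed for comparison

buckyball_f = kelvin_to_fahrenheit(BUCKYBALL_BREAKUP_K)
sun_f = kelvin_to_fahrenheit(SUN_SURFACE_K)

print(f"Buckyball figure: {buckyball_f:.0f} F")
print(f"Sun's surface:    {sun_f:.0f} F")
print("Hotter than the Sun's surface:", buckyball_f > sun_f)
```

Under these assumptions the 8,500 K figure comes out a little under 15,000 degrees Fahrenheit, which matches the "almost 15,000" in the text.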

When objects move fast enough, they become invisible to the naked eye. How fast something needs to move to be unnoticeable varies based on its size, but for an object a meter long it's about 1,100 mph, roughly Mach 1.4 at sea level. There is no reason why engines could not eventually be developed that propel person-sized objects to those speeds and beyond. In this very exciting post, I list a few possible early-stage products that could be built with molecular nanotechnology that could take advantage of high power densities. Google "molecular nanotechnology power density" for more information on the kind of technology a superintelligence could develop and use to take over the world quite quickly.

A superintelligence, not being stupid, would probably hide itself in a quarantined facility while it developed the technologies it needed to prepare for doing whatever it wants in the outside world. So, we won't know anything about it until it's all ready to go.

Here's the benefits of molecular manufacturing page from CRN. Remember this graph I made? Here it is:

We'll still be stuck in the blue region while superintelligences develop robotics in the orange and red regions and have plenty of ability to run circles around us. There will be man-sized systems that move at several times the speed of sound and draw kilowatts of power. Precise design can minimize the amount of waste heat produced. The challenge is swimming through all that air without being too noticeable. There will be tank-sized systems with the power consumption of aircraft carriers. All these things are probably possible; no one has built them yet. People like Brian Wang, who writes one of the most popular science/technology blogs on the Internet, take it for granted that these kinds of systems will eventually be built. The techno-elite know that these sorts of things are physically possible; it's just a matter of time. Many of them might consider technologies like this centuries away, but for a superintelligence that never sleeps, never gets tired, can copy itself tens of millions of times, and parallelize its experimentation, research, development, and manufacturing, we might be surprised how quickly it could develop new technologies and products.

The default understanding of technology is that the technological capabilities of today will pretty much stick around forever, but we'll have spaceships, smaller computers, and bigger televisions, perhaps with Smell-O-Vision. The future would be nice and simple if that were true, but for better or for worse, there are vast quadrants of potential technological development that 99.9% of the human species has never heard of, and vaster domains that 100% of the human species has never even thought of. Superintelligence will happily and casually exploit those technologies to fulfill its most noble goals, whether those noble goals involve wiping out humanity, or maybe healing all disease, aging, and creating robots to do all the jobs we don't feel like doing. Whatever its goals are, a superintelligence will be most persuasive in arguing for how great and noble they are. You won't be able to win an argument against a superintelligence unless it lets you. It will simply be right and you will be wrong. One could even imagine a superintelligence so persuasive that it convinces mankind to commit suicide by making us feel bad about our own existence. In that case it might need no actual weapons at all.

The above could be wild speculation, but the fact is we don't know. We won't know until we build a superintelligence, talk to it, and see what it can do. This is something new under the Sun, and no one has the experience to conclusively say what it will or won't be able to do. Maybe even the greatest superintelligence will be exactly as powerful as your everyday typical human (many people seem to believe this), or, more likely, it will be much more powerful in every way. To confidently say that it will be weak is unwarranted -- we lack the information to state this with any confidence. Let's be scientific and wait for empirical data first. I'm not arguing with extremely high confidence that superintelligence will be very strong. I just have a probability distribution over possible outcomes, and doing an expected value calculation on that distribution leads me to believe that the prudent utilitarian choice is to worry. It's that simple.
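To make the expected value argument concrete, here is a rough Python sketch of the kind of calculation I mean. Every scenario, probability, and harm figure in it is a made-up placeholder chosen only to show the shape of the reasoning, not an actual estimate, and the names (`scenarios`, `expected_harm`, `COST_OF_WORRYING`) are hypothetical.

```python
# Hypothetical scenarios: (description, probability, harm if we did not prepare).
# All numbers are illustrative placeholders in arbitrary units, not estimates.
scenarios = [
    ("weak: about as capable as a typical human", 0.50, 0.0),
    ("moderate: powerful but containable", 0.40, 10.0),
    ("strong: far more powerful than all of humanity", 0.10, 1000.0),
]

COST_OF_WORRYING = 5.0  # hypothetical cost of taking precautions, same units


def expected_harm(outcomes):
    """Probability-weighted average harm over a set of exclusive outcomes."""
    total_probability = sum(p for _, p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * harm for _, p, harm in outcomes)


harm = expected_harm(scenarios)
worth_worrying = harm > COST_OF_WORRYING

print(f"Expected harm without precautions: {harm:.1f}")
print(f"Cost of worrying: {COST_OF_WORRYING:.1f}")
print("Prudent to worry:", worth_worrying)
```

The point of the sketch is that a low-probability outcome can still dominate the total when its harm is large enough. With these placeholder numbers, the "strong" scenario gets only a 10% chance but contributes almost all of the expected harm.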

Remember, most transhumanists aren't afraid of superintelligence because they actually believe that they and their friends will personally become the first superintelligences. The problem is that everyone thinks this, and they can't all be right. Most likely, none of them are. Even if they were, it would be rude for them to clandestinely "steal the Singularity" and exploit the power of superintelligence for their own benefit -- possibly at the expense of the rest of us. Would-be mavericks should back off and help build a more democratic solution, one that ensures that the benefits of superintelligence are equitably distributed among all humans and perhaps (I would argue) some non-human animals, such as vertebrates.

Coherent Extrapolated Volition (CEV) is one idea that has been floated for a more democratic solution, but it is by no means the final word. We criticize CEV and entertain other ideas all the time. No one said that AI Friendliness would be easy.


Tallinn-Evans Challenge Grant Successful

As many of you probably know, I'm media director for the Singularity Institute, so I like to cross-post important posts from the SIAI blog here. Our challenge grant was a success -- we raised $250,000. I am extremely grateful to everyone who donated. Without SIAI, humanity would be kind of screwed, because very few others take the challenge of Friendly AI seriously -- at all. The general consensus view on the question is "Asimov's Laws, right?" No, not Asimov's Laws. Many AI researchers still aren't clear on the fact that Asimov's Laws were a plot device.

Anyway, here's the announcement:

Thanks to the effort of our donors, the Tallinn-Evans Singularity Challenge has been met! All $125,000 contributed will be matched dollar for dollar by Jaan Tallinn and Edwin Evans, raising a total of $250,000 to fund the Singularity Institute's operations in 2011. On behalf of our staff, volunteers, and entire community, I want to personally thank everyone who donated. Keep watching this blog throughout the year for updates on our activity, and sign up for our mailing list if you haven't yet.

Here's to a better future for the human species.

We are preparing a donor page to provide a place for everyone who donated to share some information about themselves if they wish, including their name, location, and a quote about why they donate to the Singularity Institute. If you would like to be included in our public list, please email me.

Again, thank you. The Singularity Institute depends entirely on contributions from individual donors to exist. Money is indeed the unit of caring, and one of the easiest ways that anyone can contribute directly to the success of the Singularity Institute. Another important way you can help is by plugging us into your networks, so please email us if you want to help.

If you're interested in connecting with other Singularity Institute supporters, we encourage joining our group on Facebook. There are also local Less Wrong meetups in cities like San Francisco, Los Angeles, New York, and London.