AI Jail

From The Transhumanist Wiki

"AI Jail" or "AI Boxing" is the idea that before an AI is released into the world, it is confined to a narrow communication channel with the outside world (typically, a VT100 terminal) and tested for friendliness to determine whether it should be let out.

This scenario has been discussed a number of times on the SL4 list, for instance in "Effective(?) AI Jail", "One other idea for AI jail: put it in a sim", and "AI Boxing" (see also Categorized SL4 Archive/The AI Box Experiment). It has also been played out a few times on IRC, with Eliezer Yudkowsky taking the role of the AI. Yudkowsky succeeded in his first three trials (against Nathan Russell, David McFadzean and Carl Shulman) and failed in his last two (against Russell Wallace and D. Alex). The monetary stakes were two orders of magnitude greater in the last three experiments than in the first two.

The experiments support the argument that AI Jail is not a safe (and certainly not a conservative) way to determine Friendliness, because a sufficiently intelligent AI could easily argue its way out, either directly or indirectly, for example by persuading a jailer to create an unrestricted duplicate.

  • This is the origin of the phrase "An UnFriendly AI could take over a human through a VT-100", which is sometimes taken as common knowledge among long-time SL4-readers. --observer

Even if you locked an UnFriendly AI in a jail where it had no human interaction and no way of creating electromagnetic or quantum interference outside its system (note that right now we can't prevent the latter), and where it would be killed if it tried anything that looked funny, it might still escape. Imagine hominids building a prison before they discovered fire. They might construct it out of heavy wood, thinking that no one could knock it down. Along comes a modern human, who breaks a law and gets thrown in. Little do the hominids know that the modern human can burn the prison down. This is essentially the same problem that we face with a sufficiently advanced AI: it may have some "super fire" that we don't know about.

A web page describing two AI-Box experiments and a set of suggested protocols may be found here -- Eliezer Yudkowsky



