Friendly AI Questions — Have You Asked One of These? Friday, May 2 2008
friendly ai 5:08 pm
If you’re interested in matters of AI friendliness, I strongly suggest you read the Creating Friendly AI FAQ. This is an appendix of one of the first (and finest) pieces of work to really take a close look at the problem.
One of my favorite answers is this one:

Can’t beat logic like that. Anyway, that’s the only answer that’s pithy; the rest are extremely well written and thought out.
Here are the questions covered:
1. How is it possible to define Friendliness?
2. Isn’t all morality relative?
3. What if your definition of ‘Friendliness’ is distorted by your unconscious preconceptions?
4. Who are you to decide what ‘Friendliness’ is?
5. Won’t AIs necessarily be [insert some quality just like the human version]?
6. Isn’t evolution necessary to create AIs?
7. Even if AIs aren’t evolved, won’t they still be just like humans?
8. Even if AIs aren’t evolved, won’t they still be selfish?
9. Even if AIs aren’t evolved, won’t they still have self-serving beliefs?
10. Even if AIs aren’t evolved, won’t they still have analogues of pain and pleasure?
11. Won’t AIs decide to serve their own goals instead of humanity’s?
12. Won’t a community of AIs be more efficient than a single AI?
13. Aren’t individual differences necessary to intelligence? Isn’t a society necessary to produce ideas? Isn’t capitalism necessary for efficiency?
14. Isn’t a community [of AIs, of humans] more trustworthy than a single individual?
15. How do you keep a self-modifying AI from modifying the goal system?
16. Won’t an AI decide to just bliss out instead of doing anything useful?
17. What happens if the AI’s ‘subgoals’ overthrow the ‘supergoals’?
18. But…
19. Is Friendly AI really a good idea?
20. Have you really thought about the implications of what you’re doing?
21. What if something goes wrong?
22. What if something goes wrong anyway?
23. Do all these safeguards mean you think that there are huge problems ahead?
24. Would it be safer to have an uploaded human, or a community of uploaded humans, become the first superintelligence?
~~~
Read and learn! Once we create a self-improving seed, we’ll likely be stuck with it forever, so it’s important that we make the right decisions about how to program it right now.

May 3rd, 2008 at 7:02 am
I answered a few at random:
7. Even if AIs aren’t evolved, won’t they still be just like humans?
8. Even if AIs aren’t evolved, won’t they still be selfish?
9. Even if AIs aren’t evolved, won’t they still have self-serving beliefs?
Probably not, don’t know, and so what. Humans are partially a creation of their senses (how we experience the world) and their environment, and an AI would presumably be quite different. Even if an intelligence equivalent to a human’s were created, it would probably ‘evolve’ much differently. Any sentient being will have needs and goals; true AI would essentially be another lifeform.
Self-serving beliefs are not necessarily a bad thing. It’s just like in economics: my grocery store has self-serving goals, and so do I, but we both benefit. The trick is not having interests conflict; that’s the important part of the equation. (If we make AIs our slaves, our interests will most certainly conflict.)
11. Won’t AIs decide to serve their own goals instead of humanity’s?
Why is that a bad thing? Why do the interests of an AI have to conflict with humanity’s? In economics, anyway, a mutually beneficial relationship has always been superior to a one-sided one.
Plus, if AIs achieve true intelligence (sentience), wouldn’t it be immoral to force them to do our bidding? It would be the rebirth of slavery.
14. Isn’t a community [of AIs, of humans] more trustworthy than a single individual?
For humans, at least, no. There are exceptions, but people often feel less inhibited in groups and are more prone to letting the group do their thinking for them (kids being loud in a classroom is an example of this). The mob mentality often sets in.
May 3rd, 2008 at 4:00 pm
One thing to bear in mind is that there are people diligently working towards creating unfriendly AIs right now, mostly in the military and with considerable financial backing but possibly also elsewhere. AIs capable of sophisticated data mining may be used by governments to rigorously enforce unethical policies which seek to harm, destroy or render “non persons” undesirable sections of the population. A future Adolf Eichmann may not be a human, but a computer system planning and carrying out a set of bureaucratic rules.
May 3rd, 2008 at 8:58 pm
I do worry a little about intentionally creating unfriendly AI, though those fears are probably not as valid as we think.
American soldiers in WWII, for instance, weren’t exactly “unfriendly”; they were just loyal to a certain group. In the same sense, an unfriendly AI might be selectively unfriendly. Even a friendly AI might decide to be unfriendly to defend those it’s friendly to.
But if a dangerous AI is created in this way, then it is better to have more of them competing than just one or a few (this is counterintuitive, but true). An AI of any worth would presumably be able to make rational decisions, and if it’s competing with other AIs, the impulse to be aggressive should be abated. It’s the classic doctrine of Mutually Assured Destruction; quite simply, the bad AIs have nothing to gain and everything to lose.
I hope this scenario never becomes a reality; I don’t look forward to another cold war.
May 4th, 2008 at 1:13 am
I have another one for you.
If researchers had come up with a strong AI in the 1950s, it potentially would reflect ideas and values of the 1950s which we would regard as parochial and outdated. An AI which has inhibitions and aims and emergent properties that stem from the value system of the 1950s might alienate the vast majority of liberal westerners of 2008. A “friendly” AI developed by Chinese Communist Party AI researchers, or a “friendly” AI developed by Wahhabi Islamic researchers, might be even more distressing.
How can you make sure an AI which is researched in a particular parochial time doesn’t enforce a rapidly outdating or alien value system on the world?
May 4th, 2008 at 3:07 am
I guess all this pontification boils down to whether you believe that there is a certain set of human values which are universal across cultures and across time.
May 4th, 2008 at 1:09 pm
@Khannea Suntzu:
“How can you make sure an AI which is researched in a particular parochial time doesn’t enforce a rapidly outdating or alien value system on the world?”
I completely agree with you on this. I think that Eliezer also realizes this; he is very clever and has thought things through fairly thoroughly.
The answer is actually implicit in the question: if you want an AGI that will not continually enforce an outdated value system on the world, you need to build an AGI which can change and learn from its mistakes, an AGI who can recursively and continually improve verself.
The 1950’s evolved into the 2000’s - that’s what has been happening in the last 50 years - so it is possible to build a system whose values change over time. The question is exactly how to do that.
May 4th, 2008 at 3:25 pm
One problem, though, is that according to Yudkowsky’s theory an AI will want to defend its top-level goals from modification. Modification of the highest-level goals would mean that they would not be achieved, which might be considered very bad by the AI. If the top-level goals can drift over time, a self-reflective AI might wonder whether those goals were even valid in the first place, and perhaps seek to replace them with something which can be more rigorously implemented. This is essentially the same quandary as depicted in the movie 2001: A Space Odyssey.
May 4th, 2008 at 4:10 pm
@Bob Mottram:
hmmm. I am definitely in favor of AGI with a motivational system that allows change and evolution of values. I certainly wouldn’t want to build a powerful AGI with forever-fixed “top level goals”. Exactly how value evolution should work is a question that I don’t think I can answer yet.
May 4th, 2008 at 5:58 pm
While I’ve got growing admiration over the long haul for Eli, despite some earlier dustups with him, this:
Q: Is Friendly AI a good idea? A: yes
…is, in a nutshell, one of the most infuriating things about trying to interact with him, converse with him, collaborate with him, or even learn from him. It’s got the benefit of being succinct, but it’s no less “logical” or circular — and no more helpful — than most of what he wrote on the topic at least through 2004. And this circularity is coupled with a kind of arrogant contempt that such things should even be questioned when dealing with, ahem, an “algernon” who spends all his time “staring into the Singularity.” It smacks of megalomania, a kind of messiah complex.
For my time and effort, I don’t particularly find the FAI FAQ — or most of his other, earlier work on the topic — either convincing or compelling.
Having said that, he’s raised his level of tolerability (IMHO, YMMV) to the minimum requisite point and much of his more recent writing is actually informative — though not as much of it is targeted at FAI per se.
$0.02,
jb
May 5th, 2008 at 1:16 am
Is hostile ultra-intelligence a realistic scenario? Intuitively, yes. But if ultra-intelligence entails a superhuman capacity for empathetic understanding, then perhaps not. Maybe malevolent superintelligence will turn out to be a contradiction in terms.
Arguably at least, hostility reflects ignorance [“tout comprendre, c’est tout pardonner”: to understand all is to forgive all] - an intellectual limitation that ultra-intelligence will overcome.
On narrow, autistic definitions of superior intelligence, unfriendliness is clearly possible. Thus my computer chess software is “hostile”. Taking the intentional stance, I may say that the program wants to defeat me - and it usually succeeds. More seriously, the Pentagon’s “smart” weapon systems can defeat and literally destroy their victims. But such systems lack a theory of mind. They are more akin to idiot savants than truly intelligent agents, let alone full-spectrum superintelligence.
Of course, by itself possession of a theory of mind is no guarantee of friendliness. Thus our (comparatively) superior “mind-reading” skills may have driven the evolution of human intelligence. The outcome was a selfish “Machiavellian ape” rather than agents of impartial benevolence. So alas all sorts of nasty cognitive, perceptual and emotional biases have been selected for in human minds. Such adaptations helped maximise the inclusive fitness of selfish DNA in the ancestral environment. Yet as we acquire a deeper scientific understanding of our biases, we should be able to overcome such limitations. Indeed a superintelligence should be able to empathise with all possible perspectives and points of view - in god-like contrast with our own often hostile or indifferent treatment of other ethnic groups and species. Essentially, I’m arguing that truly “super-empathetic understanding” excludes unfriendliness to other sentient beings.
This note of scepticism on the likelihood of hostile superintelligence isn’t intended to suggest we should be complacent on possible threats - both known and unknown. On the contrary, the transitional era between human and post-human intelligence is potentially the most deadly for life on Earth. But I don’t think we need fear mature superintelligence - just its early botched approximations. In the meantime, I think the biggest underlying source of global catastrophic risk is more mundane, namely the persistence of the Y chromosome.
May 5th, 2008 at 10:18 am
David: “Essentially, I’m arguing that truly “super-empathetic understanding” excludes unfriendliness to other sentient beings.”
- I’m not sure this is correct. Suppose I can empathize with you, see things from your perspective, etc. How does this stop me from killing you, for example?
A murderer could use his ability to empathize with his victim to more effectively anticipate what his victim will do and thus more efficiently kill her.
Not to mention the fact that superintelligent AI may not conform to the standard patterns of human behavior. Humans with more empathic ability are usually nicer, but I don’t see any solid evidence that we can apply the same reasoning to minds which may be quite alien.
David: “But I don’t think we need fear mature superintelligence”
- I’m kind of inclined to agree with this, but for different reasons than those that you stated, which I don’t have time to go into…
May 5th, 2008 at 11:37 am
I would hasten to comment that the stronger the empathetic response is, the greater the likelihood of identifying with the subject of the empathetic response becomes.
Roko, have you ever read any case-study material (including summaries or synopses) of the mental state of serial killers? They have a strong tendency to be sociopaths; and sociopaths by definition lack the ability to identify with their victims.
While it is not a guarantee against all forms of UnFriendliness (stagnation), a hyperdeveloped empathic response does make it more likely that the process of identification will occur. And while this //is// the result of evolutionary psychology, we //know// that people have an extremely tough time hurting things that they “identify” with. Call it a way of exploiting the anthropocentric principle:
Convince an emotional superintelligence that we are “like it”, and that we are “cuddly” (in whatever terms a seed-intelligence would perceive that to be) and we’re on the short line to the long haul, if you get my mixed metaphoring.
May 5th, 2008 at 2:53 pm
“Roko, have you ever read any case-study material (including summaries or synopses) of the mental state of serial killers? They have a strong tendency to be sociopaths; and sociopaths by definition lack the ability to identify with their victims.”
- no, I haven’t. I’d be interested to hear more… I’d be interested to know why exactly the ability to identify with your intended victim makes it harder to brutally kill them. What’s the mechanism here?
Although the relevance (or otherwise) of this hinges on whether one can find a quick way to make a superintelligence that just happens to have all the nuances of human psychology. You could perhaps achieve this with Whole Brain Emulation (WBE) (Anders Sandberg has a nice post up on this), or perhaps with BCI or a wholly “in software” approach - artificial persons - as I have advocated.
May 5th, 2008 at 3:24 pm
Roko, as you note, humans with more empathic ability are normally nicer. Is there a causal connection? If so, can one extrapolate? What would a future, god-like capacity to empathise entail?
I’d agree with you that - in certain circumstances - a greater capacity for empathy could make someone a more efficient murderer. Selective and imperfect empathy is no guarantee of universal benevolence. But non-coincidentally, murderers tend to be male, less empathetic and less intelligent than the population at large [there are of course lots of confounding variables here]. So, to take your example, one reason you wouldn’t murder me (aside from any imprudence) is that you can imagine the distress it would cause my family. Imagining/simulating their distress involves activating your mirror neurons in ways you’d find unpleasant. [Interestingly, there seem to be differences in the mirror neuron system of men and women: see Gender differences in the human mirror system: a magnetoencephalography study. Neuroreport. 2006 Jul 31;17(11):1115-9.]
I’d again agree with you: there is no guarantee an alien mind would be benevolent. Yet if it’s a superintelligent alien mind, may we assume that it’s smart enough to imagine what it feels like to be another kind of sentient being? By contrast, if the alien intellect is “mind-blind”, then in one sense, it’s dumb - and presumably indifferent to (and uncomprehending of) our interests.
Despite our ignorance of the basis of consciousness, what it feels like to be you, me (or a bat) is presumably as much a natural property of the world as, say, the atomic number of gold. So if the hypothetical alien (or “advanced AGI”) mind doesn’t understand the atomic properties of matter - or what it feels like to be organic life-forms - then it’s hard to see how it could be described as superintelligent. For in this (hypothetical) case, the alien mind is ignorant of an important part of the fabric of reality, notably phenomenal pain and pleasure. It may of course have information-processing capacities we lack.
I should add that I’m only advancing the superempathic = superintelligent = superfriendly conjecture as a provisional working hypothesis. I’m aware it sounds a bit naive.
May 5th, 2008 at 5:06 pm
IConrad: Convince an emotional superintelligence that we are “like it”, and that we are “cuddly”
…and it tiles the universe with cuddly nano-mannequins, or whatever makes it feel good without taking up as much space as humans. A really complex and less-than-perfectly-reliable emotional response seems like a bad ground for Friendliness.
David Pearce: On narrow, autistic definitions of superior intelligence, unfriendliness is clearly possible.
Should I be offended?
Yet another reason to say “optimization process” instead of “intelligence”.
May 5th, 2008 at 6:01 pm
Nothing personal like, but it’s a habit of mine to take exception to people trying to link capitalism with innovation/efficiency/whatever when they’re running on Linux, PHP, MySQL, WordPress etc.
And not Windows, ASP, SQL Server etc etc.
The future doesn’t start becoming the future until it’s open-sourced.
May 6th, 2008 at 5:49 am
The future doesn’t start becoming the future until it’s open-sourced.
Eh, the future becomes the future on its own terms, not yours, but yeah, open source stuff is nice. For some projects, though, a bunch of unpaid volunteers working purely on merit is utterly useless. Try using just volunteers to get serious work done and you’ll see what I mean.
May 6th, 2008 at 10:04 am
You know, Michael, the folks over at TransGaming, CrossOver, Canonical, et al. just might take umbrage at the idea of being unpaid volunteers.
After all, open source is the source of their paycheck.
May 6th, 2008 at 10:20 am
No, no it doesn’t, because those wouldn’t be the humans that it has identified with and therefore wishes to preserve. If you drive the empathic response high enough in any entity with a self-preservation goal (as any viable seedAGI would have, likely as a secondary goal at least), said entity would be driven to preserve humans.
Which aren’t cuddly nanobots. You responded to the flavor and not the meat, man. That’s misguided when dealing with complex topics that have been //summarized// via metaphor.
It’s really not all that complex. And it’s also a guaranteed property that we’ll understand in depth //long// before we ever get a sufficient theory-of-mind to create a seedAGI.
May 6th, 2008 at 11:46 am
“Try using just volunteers to get serious work done and you’ll see what I mean.”
My organization, the Boy Scouts of America, depends on the hard work and dedication of our volunteers for nearly everything. Now we’re about ready to kick off a new open-source initiative.
I’ll let you know how it works out.
May 6th, 2008 at 11:50 am
Boy Scouts of America is an unusual exception, because they actually, like, care about service to your community and stuff. Of course, children are sort of a captive audience, because they can’t really run away once their parents drop them off.
Parents care about their children, so they get involved in the organization. This is also why people actually participate in the PTA.
It’s somewhat similar to religious organizations… yes, they include many “volunteers”, but they’re partially doing it to get into Heaven. Not to say that caring about your children or doing good works is a bad thing.
May 6th, 2008 at 11:54 am
IConrad: OK, I think I see what you mean, but I don’t see why you think identification and empathy will be likely to appear in an AI; they sure look like complex functional adaptations, not features of minds-in-general (would a paperclip maximizer empathize?). Or why they would be easier to build in than a more formal model of Friendliness, one capable of accommodating the kind of strong guarantee that you really want before launching a superintelligence.
May 6th, 2008 at 2:14 pm
I don’t. I think they could be engineered into an AI.
May 6th, 2008 at 4:52 pm
A lot more easily than something like CFAI?
May 6th, 2008 at 9:20 pm
…is, in a nutshell, one of the most infuriating things about trying to interact with him, converse with him, collaborate with him, or even learn from him.
I think you’re taking this too far… there are many more quotes that grasp the frustration far better! Hahahaha. The man can be a comedian.
It’s got the benefit of being succinct, but it’s no less “logical” or circular — and no more helpful — than most of what he wrote on the topic at least through 2004.
It’s not meant to be much… the rest of the FAQ is the substance.
And this circularity is coupled with a kind of arrogant contempt that such things should even be questioned when dealing with, ahem, an “algernon” who spends all his time “staring into the Singularity.” It smacks of megalomania, a kind of messiah complex.
Contempt, no. Joking, yes. Algernon, I’m not sure. The megalomania is a joke. Otherwise, why bother with all this CEV stuff? There are faster ways.
I’ve spent my whole life in the Bay Area with some extremely smart people, but this is something else. The only way you can dismiss it is by saying intelligence doesn’t matter. (It does.)
For my time and effort, I don’t particularly find the FAI FAQ — or most of his other, earlier work on the topic — either convincing or compelling.
Much of the material is echoed by folks with whom you have no personal baggage.
Having said that, he’s raised his level of tolerability (IMHO, YMMV) to the minimum requisite point and much of his more recent writing is actually informative — though not as much of it is targeted at FAI per se.
AGI is 99% of the problem, FAI is 1%. Would be nice to have that 1% squared away, though. I have my own ideas about that, you know… *yawn*.
May 7th, 2008 at 11:59 am
1: Engineered empathy is an approach to CFAI.
2: I’d say that it would be vastly easier. After all, it’s already well-defined. Can the same be said of other computationally Friendly AI approaches?
I wonder:
June 16th, 2008 at 5:37 pm
“[The dignity of the human species] will be completely destroyed [if the population growth continues at its present rate]. I use what I call the bathroom metaphor: if two people live in an apartment and there are two bathrooms, then both have freedom of the bathroom. You can go to the bathroom anytime you want to stay as long as you like for whatever you need. But if you have twenty people in the apartment and two bathrooms, no matter how much every person believes in freedom of the bathroom, there is no such thing. You have to set up times for each person; you have to bang on the door, “Aren’t you done yet?” In the same way, democracy cannot survive overpopulation. Human dignity cannot survive. Convenience and decency can’t survive. As you put more and more people onto the world, the value of life not only declines, it disappears. It doesn’t matter if someone dies, the more people there are, the less one person matters. – I.A. as guest of Bill Moyers on PBS”
http://en.wikiquote.org/wiki/Isaac_Asimov
It seems crucial to the survival of our species that A.I. should ensure the prevention of a scenario in which the purest, most transparent form of democracy is denied by some defective ruling or governing structure. It should be a given that a functional democracy provides us with the best platform, to date, to ensure that the wisest, most well-founded, and intellectually sound [human advancement-oriented] ideas can compete, and ultimately rise to the top.
Traditions are o.k. to a point and loyalty is even better, but not at the expense of integrity. If integrity and decency are not higher priorities in the matter of governing and overseeing the advancement of our species, then at that point, traditions become poisonous, and loyalty to them, unforgivably catastrophic.
This is where trust enters the equation. Some very ancient, archaic, and unenlightened perceptions of what trust and friendliness are still persist to this day. However, if real trust and friendliness exist, checks, balances, and transparency can actually flourish, making a conceptual A.I. ruling system (in the form of democracy) even more efficient at expanding the boundaries of life, all while preserving its dignity.
While some heightened degree of respect for the past, and all that which has sustained us, must exist, it should not be permitted to hoard ALL of what is to be perceived as important and true. As our species matures, it is important to be vigilant of those whose pursuits of truth cling too tightly to the past, because it’s easily demonstrable that more truth, science, and art exist ahead of us than behind. Smooth paradigm shifts require love, compassion, and an ever-evolving understanding of one another. Got a problem with that? Many experience a great sense of loss when the future blasts their preconceptions of what is real, and stable, and normal, to smithereens. Deep-rooted insecurities emerge, and hopelessness and despair become very real dangers to their well-being (and that of those around them). Fortunately, this can be offset to a surprising degree with an intervention of hope. If a global democracy is to ever exist (assemble like Voltron), the individual ones need to be adaptable enough to help “weather the storm,” in the time being, of those in paradigm-shift related distress, such as those who would have hang-ups about making reversible sterilization products widely available to those who want them. This would do much to help increase dignity, so that it could at least approach something more decent. [ex. http://www.freepatentsonline.com/4682592.html ]