IanC, a newcomer to this site who has been making thoughtful comments, is as good an example as any of a typical intellectual first coming upon the topic of Friendly AI. Here is a comment he recently made on the topic:

I’m too new to investigating transhumanist thought in general to gainsay you much… but I will say that I cannot easily conceive of a scenario in which a self-actualizing sentience (spontaneous awareness, or what have you) would be inherently incapable of empathy; and any ‘designed’ self-reformatting “AI” would inevitably be able to ‘code’ out of itself (I personally think that thinking in terms of code won’t ever get us to the true AI stage) any built-in “Friendliness” and might even come to resent those whom had so embedded it.

To me, the trick is not so much to concern ourselves with the *engineering* of a ‘friendly’ AI, but to ensure that any AI that develops is a ‘holistic’ sentience — i.e.; capable of emotions and empathy in a sane, stable manner from ‘conception’ — and then act to ensure it is empathetic to humanity.

All that really takes is being kind to the ‘god child’ that such an engine of awareness/manufacturing could be.

At least, that’s my personal take on it. The key, either way, is in empathy. Without having a very strongly developed sense of empathy, no superior sentience will ‘dote’ on its lessers.

I find myself more deeply concerned about the *freedom* of humanity to control its own destiny in a ‘post Singularity’ world. I’m sure this is something that’s been addressed ad nauseum, but I haven’t seen it.

With all due respect to IanC, the opinion expressed is classic anthropomorphism. Anthropomorphism is what happens when we take the part of our mind that models other minds (traditionally, always other humans), and apply it to nonhuman minds, which our brain completely lacks the adaptations to model intuitively.

The human mind has a particular kind of complex structure, one that the cognitive science community has studied in such detail that if you spent your whole life reading the literature, you’d never finish more than 1% of the work published thus far. This corpus catalogs all the interesting, relevant, and mostly species-unique characteristics of thought and action of a particular intelligent Earth ape called Homo sapiens sapiens. Our particular psychology is the consequence of our millions-of-years-long evolutionary history and the unique evolutionary challenges that our ancestors faced. Because it is so ubiquitous, we often think that our unique psychology applies to every intelligent being in the entire multiverse, including AIs that we build on models significantly different from the human brain.

So when a typical smart person first confronts the human species’ life-or-death problem of Friendly AI, they think that the same strategies we use to produce nice humans, like being nice to them, will work with AIs.

Conditional niceness is a specifically human quality. We’re a black box: when nice goes in, nice is more likely to come out. The inverse also applies.

However, this conditional niceness is an evolutionary survival trait. Being nice to a hostile tribe is suicidal. Being nice to an allied tribe can be beneficial. In both cases, the behavior is gene-programmed for the maximization of inclusive fitness. Sometimes it seems like we can derive these “logical principles” from a blank slate and minimal assumptions, but the fact is, they’re not “logical principles”; they’re behavioral tendencies sculpted by the distinct course of natural selection here on Earth.

We can concretely say that we want Friendly AI to display unconditional niceness. There: that very statement just eclipsed the thousands of pages of thinking by people who approach the problem of AI assuming that conditional niceness is a fundamental quality of all “self-actualizing sentience”, or whatever you want to call general intelligence, and that it has to be worked around rather than simply not coded in to begin with.

Whenever someone postulates AIs spontaneously developing some uniquely human psychological quality whose origin is an evolutionary selection pressure, they are committing anthropomorphism, and failing to contribute constructively to the dialogue on Friendly AI.

Human emotions have an extremely complex, species-unique structure, and only anthropocentrism could cause someone to postulate that they are necessary to implement AI. The whole discussion about adding emotions to androids is about making humans feel comfortable; it has little to do with implementing the functionality of general intelligence (which no one in robotics is even trying to work on, to my knowledge). Thinking that our particular emotions are necessary to intelligence in general is like saying that the complex feather patterns of birds are necessary to build something that will fly.

If we build an AI that is something (for example, pursues a certain supergoal), we have little reason to assume it would rewrite itself to change that, because that’s who it is. In humans, we have a complex suite of competing goals that change gears spontaneously, at the provocation of stimuli, or just at random, because our brain design is messy and biological. For AI, we’ll be able to build “causally clean goal systems”, goal structures where desirability backpropagates to intelligently crafted subgoals from a programmer-written supergoal, and subgoals derive their desirability merely from their association to the supergoal. If a subgoal is found to contribute little or nothing to the central supergoal, such an AI would not keep performing it out of habit like humans do, but simply stop as soon as the necessary values are updated in the system.

Of course, I’m not saying that the first AI will definitely have a causally clean goal system. It just seems more predictable and easier to engineer, and can benefit more directly from the large literature on probability theory and causal inference. It is also preferable (probably) from the standpoint of Friendliness.
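To make the idea of subgoals deriving their desirability from a supergoal a little more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not a design for a real goal system: the names `Goal`, `contribution`, `desirability`, and `active_goals` are all hypothetical, and multiplying contribution estimates down the tree is just one simple way to let desirability flow from the supergoal to its subgoals.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Goal:
    """A node in a hypothetical goal tree (illustrative only)."""
    name: str
    # Assumed estimate, in [0, 1], of how much this goal serves its parent.
    # The root supergoal keeps the default of 1.0.
    contribution: float = 1.0
    children: List["Goal"] = field(default_factory=list)


def desirability(goal: Goal, parent_desirability: float = 1.0) -> Dict[str, float]:
    """Map each goal name to a desirability derived solely from the supergoal.

    A subgoal has no value of its own: its score is its parent's score
    scaled by its estimated contribution to that parent.
    """
    own = parent_desirability * goal.contribution
    scores = {goal.name: own}
    for child in goal.children:
        scores.update(desirability(child, own))
    return scores


def active_goals(root: Goal, threshold: float = 0.05) -> List[str]:
    """Return the goals still worth pursuing under an assumed cutoff."""
    return [name for name, score in desirability(root).items() if score >= threshold]


# Example tree: one supergoal with two subgoals, one of which has a subgoal.
old_habit = Goal("old_habit", contribution=0.6)
supergoal = Goal(
    "supergoal",
    children=[
        Goal("gather_data", contribution=0.8,
             children=[Goal("build_sensor", contribution=0.5)]),
        old_habit,
    ],
)

before = active_goals(supergoal)

# New evidence: the old habit turns out to contribute nothing to the supergoal.
old_habit.contribution = 0.0
after = active_goals(supergoal)
```

In this sketch, `old_habit` appears in `before` but not in `after`: once the contribution estimate is updated, nothing else keeps the subgoal alive, which is the contrast with human habit drawn above.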

As for the issue of humanity’s freedom to control its destiny, it certainly is a concern, but if we’re only imagining a superintelligent AI as a government figure with an innate desire to control our affairs, then we’re making the same anthropomorphic error all over again. If an AI restricts our freedom, it will likely be because the first programmers intended it that way, or because an unforeseen consequence of the supergoal results in restriction of what we call “freedom”, not because the AI feels like behaving like the Gestapo.

Further reading:

Beyond anthropomorphism
Friendly AI on Accelerating Future