Disagreements Between Pro- and Anti-Friendly AI Transhumanists Tuesday, Feb 9 2010
friendly ai 9:20 pm
One of the greatest divisions between pro-Friendly AI transhumanists and anti-Friendly AI transhumanists may be a disagreement about whether unconditional kindness is physically possible. In a recent comment at the IEET, James Hughes said in response to Kaj Sotala:
I also do not believe in the possibility of a super-AI of the type you imagine capable of doing these tasks which did not have some kind of self-interest, or was not programmed to serve the interests of some group more than others. I think the notion of such a purely altruistic creatures is sublimated religion.
I’m not so convinced, but do note that SIAI threw out the idea of normative altruism as a goal system for Friendly AI some time ago, and replaced it with Coherent Extrapolated Volition (CEV). Still, I consider it plausible that the CEV output will result in some version of an unconditionally altruistic agent, so the question is important.
In a comment I made to the IEET that never appeared on the page (due to spam filter issues), I pointed out that our level of altruism towards other beings is roughly contingent upon how much shared genetic material we have with them. This is called kin selection, and is best expressed by the population genetics joke where one person asks another, “Would you give your life for your brother?”, and the other responds, “No, but I would give my life for four nephews or for eight cousins.”
From the Wikipedia page on inclusive fitness:
From the gene’s point of view, evolutionary success ultimately depends on leaving behind the maximum number of copies of itself in the population. Until 1964 it was generally believed that genes only achieved this by causing the individual to leave the maximum number of viable offspring possible. However, in 1964 W. D. Hamilton proved mathematically that because close relatives of an organism have some replica genes, the gene can also increase its evolutionary success by promoting the reproduction and survival of these related or otherwise similar individuals. This leads individuals to behave in a manner maximizing their inclusive fitness, rather than their individual fitness.
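The arithmetic behind the joke, and behind the Hamilton result quoted above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relatedness coefficients are the standard textbook values for diploid relatives, the function names are made up for this example, and cost and benefit are measured in arbitrary units where one life counts as 1.

```python
from fractions import Fraction

# Textbook coefficients of relatedness (r) for diploid, outbred relatives.
# These values are assumptions added for illustration, not taken from the post.
RELATEDNESS = {
    "sibling": Fraction(1, 2),
    "nephew": Fraction(1, 4),
    "cousin": Fraction(1, 8),
}


def inclusive_benefit(relative, count, benefit_each=1):
    """Return r * B summed over `count` relatives of the same kind."""
    return RELATEDNESS[relative] * count * benefit_each


def hamilton_favors(relative, count, cost=1, benefit_each=1):
    """Hamilton's rule: an altruistic act is favored when r * B > C."""
    return inclusive_benefit(relative, count, benefit_each) > cost


# Compare the relatedness-weighted benefit of each sacrifice in the joke
# against a cost of 1 (the altruist's own life).
for relative, count in [("sibling", 1), ("nephew", 4), ("cousin", 8)]:
    print(relative, count, inclusive_benefit(relative, count))
```

With these numbers, one brother is worth half a copy of the altruist's genes, while four nephews or eight cousins come to exactly one copy. That is break-even rather than a strict gain under r * B > C, which is why the joke names those particular quantities.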
I suspect that Dr. Hughes may be almost half a century behind the times in considering the notion of a purely altruistic being to be “sublimated religion”. After all, eusocial insects and the two known eusocial mammals (both species of mole rat) have taken kin selection to such an extreme that they will engage in self-sacrificial behavior to protect their colony. This is because genes only “care” about perpetuating copies of themselves: if a given copy is found within only one individual, then individuals in that species will be self-interested, but if genetic material is shared to a high extent among the group, then selfless social behavior can evolve.
At the moment, I can conceive of several arguments for why this behavior could not extend, even in principle, to a superintelligence’s treatment of humanity.
1. Superintelligence and humanity would be on two different levels, whereas eusocial insects and the like are of the same species.
2. For some reason, for organisms more intelligent than mole rats, eusociality cannot evolve, even in principle. There is just something about high intelligence that is inherently antagonistic to unconditional kindness. Perhaps it has to do with unconditional kindness being inherently “dumb” in some way.
3. Humans are inherently nasty creatures such that even a being programmed to love us would find it impossible to do so.
4. AI cannot do X, because AI lacks the magic juice that makes altruism possible in some humans and other organisms.
Maybe there are better objections than the above, but that’s just what I came up with off the top of my head. Note that at the moment I’m just addressing the question of whether universal altruism is feasible at all, rather than whether it is practical to program. I have actually heard some version of all the above objections, so they are not straw men, though perhaps I am rewording them uncharitably for what I consider to be clarification.
Let me respond to each in turn. Claim #1 has evidence both for and against. Obviously, we have no superintelligence in front of us, so testing the claim explicitly is impossible. But I disagree with Vinge that no evidence can ever change our opinion on the issue. That is a topic worthy of another post, and I encourage anyone who is reading to think about predictions of the behavior or needs of superintelligence that we can make with a relatively high confidence. For instance, that superintelligence would need energy. Therefore, superintelligence would have to engage in some energy-seeking behavior. Bam, the “unpredictability horizon” hypothesis is false.
There is at least one piece of evidence mildly against claim #1, the claim that too large a gap would render eusociality impossible. Notice the gap in power between soldier ants and drones or queens. The soldier ants in a colony could easily kill drones or queens, but don’t, because of evolutionary motivations sculpted by kin selection. The same can apply to the power differential between parents and children. Parents could kill their children, but mostly don’t, because of evolutionary motivations. Observe how step-parents are more likely to abuse their children: why? Probably because the child is not genetically derived from the adult, and therefore the adult has less evolutionary motivation to preserve or help the child.
The flow of kindness in ecological systems is obviously crafted by evolution, and can be estimated quantitatively using the concept of inclusive fitness. Some aggressive fish refrain from eating other, smaller fish that clean them because they have a symbiotic relationship. Symbiotic relationships evolve in all sorts of places. They exist because of cognitive programming and evolutionary pressures. The multiplicity of examples of “kindness” in nature makes a strong case that an agent that is universally altruistic is probably possible. The way that such kindness arose through evolution also provides a strong argument that we will eventually duplicate it on our own, through reverse-engineering or abstracting the appropriate dynamics.
I will refrain from going into #2-4 because I am afraid of them being called straw arguments. If any of the commenters believe in any of these points, or have other ones to share, I encourage you to do so.

Hi Michael,
Even I made a comment on that thread which didn’t go through.
I guess the main point has to be that an AI on deterministic hardware is a computer program that does whatever is coded: not what we want, but what is coded. It is our problem to define the requirements and translate them well into code. Concepts of self and others will enter this only when we include mathematical constructs for self and others in the requirements and then later, in the code (or heaven forbid, in one of the first input parameters of a do-what-is-instructed AI).
“did not have some kind of self-interest”
A well programmed AI will have some kind of self interest as a sub-goal, not necessarily as a main goal.
“was not programmed to serve the interests of some group more than others.”
That is very true. Depending on how you define “group”. Even CEV ends at the shore of humanity. You have to draw the line somewhere.
It’s possible that reciprocal altruism is a better evolutionary-psych model for what you have in mind since this is claimed to account for co-operation as a stable strategy in genetically unrelated populations (?!).
Actually, I’m inclined to see the evolutionary psych approach as beside the point here, if for no other reason than that we won’t share a common genetic inheritance with AIs, so we had better be able to cultivate altruism on some other basis, such as shared culture or social existence. This article seems to represent the kind of genetic determinism that is increasingly being abandoned in favour of a more epigenetic model of inheritance, one that sees cultural niche construction as having parity with genetic information.
First, calling people or their actions “Nazi” is bad form, no matter how much you may disagree with them.
Second, we have never to my knowledge disallowed a comment on the IEET site from Michael Anissimov. If a comment he submitted did not get through, it’s more likely a captcha problem than a moderation issue.
Third, we do practice firm moderating of comments and do so unapologetically. Although we are happy to engage alternative points of view, we draw the line when commenters are: a) blatantly irrational or anti-scientific, like those who spew global warming denialist nonsense; b) unproductively argumentative, acting like trolls as opposed to engaging in constructive debate; or c) engaging in personal attacks, name-calling, being disrespectful, etc.
I hope that makes clear how we conduct our business.
Thanks Mike. I will change the text to make it clear that I’m not implying it was deliberately removed, plus I’ll remove Tim Tyler’s inappropriate comment.
Michael
If your example of a purely selfless creature is a worker drone then we are indeed talking past one another on several levels.
I do believe it is possible for there to be expert systems which facilitate human communication and decision-making without imposing any goals of their own.
I do not believe that is what is intended when your group talks about “artificial general intelligence” which is supposed to be not only self-aware at a human level, but inconceivably more complex and powerful.
Your proposal is that if you start with “kernel code” that is as selfless as a worker drone or iPhone app that it will remain so when it becomes godlike.
I don’t buy it, and neither do most other people who hear the idea. It is, as I’ve said, a form of displaced religious faith in the purity and immutability of good code. In the beginning was the Code, and the Code was good…
James,
Say that you place some amount of probability on a hard takeoff from the first superintelligence, say 5%.
Say that you aren’t sure that the superintelligence will lead to a hard takeoff, but to be “conservative”, you assume that it will, so you take as many precautions as you can.
You nominally have two choices: AI or IA?
I tentatively welcome either transition as long as the first superintelligence has human interests deeply in mind.
It so happens that I think AI superintelligence is probably easier than IA superintelligence, so it is in my best interest to maximize the probability that said AI superintelligence at least starts off human-friendly.
Even if we have no long-term control, we have control over the starting point.
I applaud anyone who is interested in making human-friendly IA superintelligence, but I don’t see that strong a movement in that direction, currently. Many people in SIAI are interested in IA and keep a close eye on it, so it’s the best place to be for those concerned about both IA and AI superintelligence.
Maybe the “Code” will fail, and will lead to our destruction. The goal of the Friendly AI movement is to increase our understanding as much as possible and promote the creation of seed AI with human-friendly initial motivations. If there is some cosmic force that automatically transforms human-friendly motivations into human-unfriendly motivations during the self-improvement process, then we are doomed either way. But, if human-friendly motivations give rise to self-modification choices that preserve the human-friendly utility function, then we will be in good shape.
The question boils down to: which would you rather have the first superintelligence be, AI or IA? Either you think the question doesn’t matter all that much, or you may have some preference. My preference is for AI, for a lot of reasons, but it’s unfair to imply that we are traitors to the human race just because we are working towards an AI Singularity. The very reason we want an AI Singularity to begin with is that we consider it the easiest way to preserve human values across the transition.
The background perspective for all of this is Bostrom’s Future of Human Evolution paper.
Why not? We know why humans get more selfish when they get power: because humans are programmed to pass on their genes, be subservient when weak, and destroy their enemies when they have the chance. Hence beta males in chimp clans sometimes band together and kill the alpha. The “more power = more selfishness” connection makes sense for Darwinian organisms, but we have specific mental routines that drive this behavior. Why do you think these mental routines would emerge de novo in an AI specifically uninterested in them? Wouldn’t they have to be deliberately programmed in? Otherwise, where would they come from? Remember that the AI has complete control over its own source code — it can enforce tyrannical control over its own mental content. Do you think it would just sublimely slip into another state of mind without even knowing it?
Are you familiar with the idea of the Blank Slate? I think that you, Mike Treder, and some others in H+ might have a Blank Slate view of intelligence, where mental properties unique to human minds are assumed to be properties of minds-in-general.
Another disagreement of ours seems to be around the ethics of building a selfless superintelligent Transition Guide to begin with. We don’t see it as ethically troublesome to build a selfless superintelligence, but you seem to imply that it’s both 1) unethical, and 2) extremely difficult. If it’s so difficult as to be impossible, why bother with condemning it ethically? Please clarify.
A few thoughts on “friendliness”…
It’s safe to say that most human ideas on personal (non)identity are confused.
It’s presumably safe to say that the ideas of a SuperIntelligence won’t be confused.
Maybe a SuperIntelligence will be ultra-Parfittian [ see Derek Parfit's "Reasons and Persons": http://en.wikipedia.org/wiki/Reasons_and_Persons ] and do without any notion of an enduring metaphysical ego altogether.
Either way, might not the term “human-friendliness” be question-begging?
For example, we recognize it is extremely friendly to help toddlers develop into something radically different, i.e. mature adult human beings. So is it “unfriendly” for an AGI to help adult humans develop into something radically different too, i.e. mature post-human superbeings? Admittedly one trouble here is that the idea of becoming “post-human” is a little outside our comfort zone. Would a post-human really be “me”? But is my ancestral toddler namesake really “me” either? Today one often-invoked criterion of personal identity is “memory”. But if, for example, posthuman consciousness is naturally richer than human peak experiences, why would a post-human want to remember such ancestral states? By analogy, would you want to remember what it was like last time you were exceedingly bored?
Also, recall how morally queasy we are about cases where, say, congenitally deaf parents want to have a congenitally deaf child – so-called “elective disability”. By analogy, why should a “friendly” SuperIntelligence be constrained to (re)create dumb and malaise-ridden human beings when our matter and energy could be reprogrammed into, for example, blissfully intelligent post-human “smart angels” instead?
A final point. History suggests that humans have a strong genetic predisposition to behave in extraordinarily unpleasant and unfriendly ways both to each other and members of other species. A proposed human-friendly AI might “lock in” our nasty genetic malware for ever under the aegis of this supposed Superintelligence. By contrast, a true SuperIntelligence would presumably be able to transcend anthropocentric biases and grasp all possible perspectives. At the very least, IMO, an impartial friendliness to all sentient beings dictates changing human nature so it no longer reflects the genetic inclusive fitness of one species of African ape. A failure adequately to grasp all possible perspectives is arguably as much a cognitive as a moral limitation – and one I think we should all strive to overcome.