One of the greatest divisions between pro-Friendly AI transhumanists and anti-Friendly AI transhumanists may be a disagreement about whether unconditional kindness is physically possible. In a recent comment at the IEET, James Hughes said in response to Kaj Sotala:

I also do not believe in the possibility of a super-AI of the type you imagine capable of doing these tasks which did not have some kind of self-interest, or was not programmed to serve the interests of some group more than others. I think the notion of such a purely altruistic creatures is sublimated religion.

I’m not so convinced, but do note that SIAI threw out the idea of normative altruism as a goal system for Friendly AI some time ago, and replaced it with Coherent Extrapolated Volition (CEV). Still, I consider it plausible that the CEV output will result in some version of an unconditionally altruistic agent, so the question is important.

In a comment I made to the IEET that never appeared on the page (due to spam filter issues), I pointed out that our level of altruism towards other beings is roughly contingent upon how much shared genetic material we have with them. This is called kin selection, and is best expressed by the population genetics joke where one person asks another, “Would you give your life for your brother?”, and the other responds, “No, but I would give my life for four nephews or for eight cousins.”

From the Wikipedia page on inclusive fitness:

From the gene’s point of view, evolutionary success ultimately depends on leaving behind the maximum number of copies of itself in the population. Until 1964 it was generally believed that genes only achieved this by causing the individual to leave the maximum number of viable offspring possible. However, in 1964 W. D. Hamilton proved mathematically that because close relatives of an organism have some replica genes, the gene can also increase its evolutionary success by promoting the reproduction and survival of these related or otherwise similar individuals. This leads individuals to behave in a manner maximizing their inclusive fitness, rather than their individual fitness.
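
The arithmetic behind the joke above can be made explicit with what is usually called Hamilton's rule. The sketch below is the standard textbook form, not something taken from the quoted passage. The symbols r (relatedness), B (benefit to the recipient), and C (cost to the actor) are the conventional ones, and the relatedness values assume ordinary diploid relatives (full sibling 1/2, nephew 1/4, first cousin 1/8). Treating each life saved as a benefit of 1 and one's own life as a cost of 1 is a simplifying assumption made only for illustration.

```latex
% Requires amsmath for \tfrac and \text.
% Hamilton's rule: an altruistic act is favored by selection when
\[
  r B > C
\]
% r = coefficient of relatedness between actor and recipient
% B = reproductive benefit to the recipient
% C = reproductive cost to the actor
%
% Applied to the joke, with the cost of one's own life normalized to C = 1
% and each life saved counted as a benefit of 1:
\[
  \underbrace{\tfrac{1}{2}\cdot 1}_{\text{one brother}} < 1,
  \qquad
  \underbrace{4\cdot\tfrac{1}{4}}_{\text{four nephews}} = 1,
  \qquad
  \underbrace{8\cdot\tfrac{1}{8}}_{\text{eight cousins}} = 1 .
\]
```

Under these assumptions a single brother falls short of the threshold, while four nephews or eight cousins sit exactly at the break-even point, which is why the joke names those numbers.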

I suspect that Dr. Hughes may be almost a half-century behind the times in considering the notion of a purely altruistic being to be “sublimated religion”. After all, eusocial insects and the two known eusocial mammals (both species of mole rat) have taken kin selection to such an extreme that they will engage in self-sacrificial behavior to protect their colony. This is because genes only “care” about perpetuating copies of themselves: if a given copy is found only within one individual, then individuals in that species will be self-interested, but if genetic material is shared to a high degree among the group, then selfless social behavior can evolve.

At the moment, I can conceive of several objections as to why this behavior could not extend, even in principle, to a superintelligence's treatment of humanity.

1. Superintelligence and humanity would be on two different levels, whereas eusocial insects and the like are of the same species.

2. For some reason, for organisms more intelligent than mole rats, eusociality cannot evolve, even in principle. There is just something about high intelligence that is inherently antagonistic to unconditional kindness. Perhaps it has to do with unconditional kindness being inherently “dumb” in some way.

3. Humans are inherently nasty creatures such that even a being programmed to love us would find it impossible to do so.

4. AI cannot do X, because AI lacks the magic juice that makes altruism possible in some humans and other organisms.

Maybe there are better objections than the above, but that’s just what I came up with off the top of my head. Note that at the moment I’m only addressing the question of whether universal altruism is feasible, not whether it is practical to program. I have actually heard some version of all the above objections, so they are not straw men, though perhaps I am rewording them uncharitably for the sake of clarity.

Let me respond to each in turn. Claim #1 has evidence both for and against. Obviously, we have no superintelligence in front of us, so testing the claim directly is impossible. But I disagree with Vinge that no evidence can ever change our opinion on the issue. That is a topic worthy of another post, and I encourage anyone who is reading to think about predictions of the behavior or needs of superintelligence that we can make with relatively high confidence. For instance, a superintelligence would need energy, so it would have to engage in some energy-seeking behavior. Bam, the “unpredictability horizon” hypothesis is false.

There is at least one piece of evidence mildly against claim #1, the idea that too large a gap would render eusociality impossible. Notice the gap in power between soldier ants and drones or queens. The soldier ants in a colony could easily kill drones or queens, but don’t, because of evolutionary motivations sculpted by kin selection. The same can apply to the power differential between parents and children. Parents could kill their children, but mostly don’t, because of evolutionary motivations. Observe that step-parents are more likely to abuse their stepchildren. Why? Probably because the child is not genetically derived from the adult, and therefore the adult has less evolutionary motivation to preserve or help the child.

The flow of kindness in ecological systems is obviously crafted by evolution, and among relatives it can be estimated quantitatively using the concept of inclusive fitness. Some aggressive fish refrain from eating the smaller fish that clean them, because the two have a symbiotic relationship. Symbiotic relationships evolve in all sorts of places. They exist because of cognitive programming and evolutionary pressures. The multiplicity of examples of “kindness” in nature makes a strong case that a universally altruistic agent is probably possible. The fact that such kindness arose through evolution also provides a strong argument that we will eventually duplicate it on our own, through reverse-engineering or abstracting the appropriate dynamics.

I will refrain from going into #2-4 because I am afraid of them being called straw arguments. If any of the commenters believe in any of these points, or have other objections, I encourage you to share them.