I was reading “Thoughts on Friendly AI” at utilitarian-essays.com, a site with short papers on various utilitarian issues, including AI friendliness. (An unfortunate aspect of incorrectly programmed strong AI is that it poses a huge risk to humanity.) I wanted to point out several interesting positions in the essay, as well as respond to a few open questions. From here on out, I will refer to the author as “Utilitarian”.
Utilitarian writes:
I think the probability that humans will create an AGI is not trivially small; I wouldn’t put the figure below 0.01, and personally I would consider 0.15 or so to be a more reasonable Bayesian best-guess estimate. Thus, if the stakes are sufficiently high, work related to friendly AI may have enormous expected value.
Here, Utilitarian admits having a low estimate of the probability that humans will create strong AI: around 15%, but at least 1%. In spite of this, the author concludes that friendly AI work may have enormous expected value. This means that you don’t have to believe it’s particularly likely that strong AI will be built for AI friendliness to be a big deal anyway. This is because of the degree of power that a strong AI would have if it is indeed technologically possible.
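To make the arithmetic behind that conclusion concrete, here is a minimal sketch of the expected-value reasoning. The probabilities 0.01 and 0.15 come from Utilitarian's quote; the stake size and the function name are my own placeholders, chosen only for illustration, and the simple "probability times stake" model is an assumption rather than anything the essay spells out.

```python
# Illustrative sketch only. STAKE is a hypothetical placeholder, not a figure
# from the essay; the model (probability times stake) is a simplifying assumption.

def expected_value(p_agi: float, stake: float) -> float:
    """Expected value of friendly-AI work: chance AGI is built times what is at stake."""
    return p_agi * stake

STAKE = 1_000_000  # hypothetical units of value at stake if strong AI is built

# Utilitarian's lower bound (0.01) and best guess (0.15)
for p in (0.01, 0.15):
    print(f"p = {p:.2f} -> expected value = {expected_value(p, STAKE):,.0f}")
```

The point is only that a small probability multiplied by a sufficiently large stake still yields a large product, so the conclusion does not depend on strong AI being likely.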
Next, Utilitarian launches into a look at the Coherent Extrapolated Volition (CEV) programmatic concept for AI friendliness. It is pointed out that a CEV-based AI could lead to beneficial outcomes for non-human animals if humanity’s volition decides to include theirs:
CEV would be designed as a dynamic process in which the FAI would extrapolate humanity’s volitions slowly at first and then build upon those volitions in order to rewrite its code and improve the extrapolation process in subsequent iterations. So, for instance, if in the first round, humans decided that chimpanzee volitions should be counted (to the extent this is possible), then chimpanzees would be included in the second round.
My general comment: it might seem unfair to program a strong AI to care only about the opinions of humans for the first round (this is the current plan), but unfortunately, anything else is too risky. What if we decide to program the AI to average our volition with that of our cats, and the cats end up outnumbering the humans, and it turns out that cats don’t like us all that much? Do we want a strong AI on the cats’ side? Probably not. On the same note, we should program the first strong AI in such a way that it doesn’t unfairly favor a small subset of humans.
Utilitarian then writes, provocatively:
However, the starting point–i.e., who will be extrapolated in the first round–is arbitrary, because we can’t rely on the CEV process to decide that for us. The current plan is to extrapolate only humans and allow them to decide whether to include non-human animals in subsequent rounds. But why stop there? Why not only extrapolate humans born in January and allow them to decide whether to include humans born in other months?
We might hope that all roads will lead to Rome and that all initial choices of the set of volitions to extrapolate will lead to the same result, but this is far from obvious. Thus, the choice of whether and to what extent to include non-human animal volitions in CEV is an important open question–one with which animal-welfare organizations might consider getting involved.
The designers of CEV assume that the category “all humans” is ideal for the initial input. It must be made openly clear that this choice is entirely arbitrary. For reasons I will describe in another post, I actually suspect it might be safer to use only one human being as the initial input. In any case, the author points out here that animal welfare organizations might be interested in lobbying for a place for certain animals, for instance the great apes, in the first CEV input stage.
Personally, I object to the killing of higher vertebrates if at all possible. I suspect that once in-vitro meat becomes available, many people will “spontaneously” begin realizing that destroying animals for food was an ethical sacrifice all along, and the practice will fall out of vogue, like slavery. Would this be recognized in the first round of CEV? I’m counting on it, but who knows? If the answer is no, do I have a moral obligation to lobby that higher vertebrates be included in the initial CEV input? I don’t think so, because doing so might make the fundamental building block more complex, less stable, and more unpredictable.
We have an obligation to maximize the stability and predictability of the outcome of strong AI, because anything else is unfair to the people that have to live with it. It might not be possible to put the genie back into the bottle. An obviously flawed AI might be capable of self-perpetuating its influence despite any attempts to stop it, leading to an unpleasant period between its creation and Heat Death. This could be about 10^40 years, a long time by any measure.
Do all roads lead to Rome? I hope so, but there is little way of telling in advance. One crutch used in the CEV approach is to have a way of peeking at the final outcome and vetoing it if it is obviously a failure. If a single person or exclusive group has the right to do this, one might ask, “how is this different than using just their volitions as input to begin with?” I see a difference, but it’s admittedly subtle. The elite group would have veto power, but it wouldn’t be micro-managing the outcome.
Utilitarian then writes:
It may be the case that animals don’t have an abstract enough sense of their volitions for CEV to work with them. If this is true, the same could be said of human infants. It’s not obvious to me that human infants deserve more direct influence over CEV than, say, pigs. If one makes the argument that human infants have the potential to develop into adults with a better sense of their true volitions, then replace “human infants” by “human adults with significant intellectual disabilities.”
It may be a good idea to exclude human adults with significant intellectual disabilities and human infants. Neurologically, all humans above a certain age are basically similar, but infants and adults with significant intellectual disabilities will be distinctly different. Where do we draw the line? I don’t know, but it seems counter-productive to include the input of minds that lack an abstract sense of morality. Most would at least agree that people in a coma cannot make moral choices in their current state.
Utilitarian writes:
It’s plausible that the lives of most wild animals involve more suffering than happiness; this is especially likely if insects are sentient. On the other hand, most humans value nature highly and would prefer for wildlife to exist. I’m afraid that the CEV of humanity wouldn’t give enough consideration to the suffering of wild animals and, even worse, might create vastly more through terraforming, directed panspermia, or sentient computer simulations of nature.
My hope is that this concern would be addressed by the “if we knew more” part of CEV. If humans were more cognizant of wild-animal suffering and were able to more deeply imagine how horrible it is for, say, a frog to be swallowed alive by a snake, then perhaps they would be more reluctant to value “pristine natural environments.” And if their opinions were still unmoved, then maybe the impulse to preserve nature would be so strong that it would indeed have some merit.
If insects are sentient, we will figure it out soon enough. We should have faith in humanity’s ability to identify failings in our own morality and collectively improve, as has occurred since at least the Middle Ages. In any case, specially intervening to remove the possibility entirely would be a breach of ethics and interference from an elite group.
A similar concern relates to lab universes. If anyone were going to create infinitely many new universes in a laboratory, it would probably be an AGI. I’m concerned that humans would find the creation of new universes so exciting, cool, or unusual that they would ignore the fact that they would create an infinite amount of suffering in the process–and probably far more suffering than happiness
If lab universes are possible, hopefully we’ll come to a democratic conclusion that they shouldn’t contain suffering sentients. I don’t see why we wouldn’t.
In favor of SIAI, Utilitarian writes:
Of course, these scenarios assume that the friendly AI would be built correctly and humanely, but this is an argument in favor of SIAI’s work, rather than against it. Better to have a friendly AI determine the future of our part of the universe than a careless (or even malevolent) AI built by less circumspect programmers.
I will address the third part, “Religion”, in another post.