Cross-posted from SIAI blog.

Consider two classes of AIs. One class manipulates external objects to direct the world towards a goal state; the other doesn’t. The AIs with the greatest real-world impact fall into the first category. The objects may be virtual as well as physical, although both are ultimately the same thing, as reality is harmonious and unified.

Within the first category, there are AIs whose motivations produce open-ended, indefinite manipulation of external objects, and AIs whose motivations cause the manipulation to stop once a critical threshold of utility maximizing (or satisficing) has been reached.

A CEV-AI is an example of the latter subcategory. It extrapolates humanity’s volition, creates an optimizing process that embodies it, then shuts itself down. There’s a technical problem here: how do you program it so that it doesn’t attempt to turn the planet into a supercomputer to compute humanity’s volition, disintegrating humanity in the process? Some version of interim Friendliness, no doubt, but remember, the CEV-AI’s primary job is to output humanity’s collective will, not to be nice to humanity on a day-to-day basis. I’ll let the Friendly AI theorists try to figure that one out.

But back to the categories: I would think that most possible AIs fall into the first of these two subcategories, the open-ended, indefinite manipulation of external objects. In fact, most intelligences probably do. If a human life were extended to a quadrillion years, those quadrillion years would likely consist of the manipulation of external objects. The same goes if you extended the life of a chimp or a badger indefinitely. The results might get boring pretty fast (rest, eat, sex, rest, eat, sex), but that manipulation of external objects would keep on going.

Imagine a sim-world, maybe something like the game Spore, which Will Wright thinks will change the world, with an indefinitely-living couple, be it chimp or human, living in it. Eventually their semi-random walk and their offspring would encompass the world, and in the case of the humans, they might even learn how to convert the world into billions of O’Neill colonies for maximum usefulness. When minds have an open-ended desire to manipulate the external world, in the long run, things never stay the same.

Because AIs would be running on accelerated substrates, the “long run” for them could be a few minutes or hours. An AI with an open-ended desire to manipulate external objects will eventually pattern over anything not to its liking, just as a gardener will eventually pluck all the weeds in a garden if he has the time to do so. That’s why it’s damn important to make sure the first AI considers us, with all our flaws and imperfections, to be in alignment with its goals: if not, we’re toast in the long run, and for the AI, the long run ain’t very long at all.