Over at his blog, Roko has been giving interesting accounts of how he has come around to “Human Values”, the arbitrary-but-all-we-know moral system around which our (SIAI & friends) current best guess of what to do with seed AI is constructed. Previously, Roko believed that convergent subgoals were objectively moral, and he was an adherent of moral realism in general. In his most recent post, Roko writes:

“It’s been an awesome journey since I started this blog nearly two years ago. My original goal was to understand exactly what we mean when we say that we want to “make the world a better place using technology”. The answer is actually fairly complex – and I feel that very few people within the transhumanist community actually understand enough to have grasped this answer. This is because the answer is, in a way, disappointingly un-transhumanist. The non-existence of external, objective moral standards means that we are far less morally obliged to push a high-technology transhumanist agenda than we otherwise would be, and that even if the future does lead us to the high rates of technological change that transhumanists have so worshiped, the optimal way for us to use that technology will probably look quite mundane compared to the grand visions of some transhumanists.”

Roko then points to my “The Future Might Look Like the Past… at First” post from early 2006, which is actually one of the inaugural posts on this blog. In it, I write:

“When we gain the ability to manipulate reality easily, most people will probably not choose to live within the sanitized white hallways of science fiction or the boring monoliths of The Jetsons. We will create more forests, rolling grasslands, huge gardens, splendid castles, and other things we can’t yet imagine. We’re all human, and most humans foster a romantic yearning to recreate some idealistic past. The true past was a place of disease and suffering, but we love the pleasant outlines transmitted to us through stories and our imaginations.”

Now, when I consider a post-Singularity world, I imagine something much more “mundane”, and much more similar to our present world, than what I imagined around, say, 2002, during my “Computronium Shockwave” period. This matters because I believe there is a connection between how Friendly AI theories are created and what the set of imagined outcomes looks like. Instead of worshiping technological progress, transhumanists should praise the fulfillment of human preferences, even if those preferences are only tangentially related to technology or are quite low-tech. But high tech will be necessary to preserve the freedom required for people to pursue their own preferences (that is, freedom from confrontation or undue influence by more powerful agents).

In the comments on Roko’s last post, Carl Shulman describes some of the open challenges within the CEV approach, adding to the public discussion:

“It’s also important to remember the extent to which these fields still have open problems with significant importance. For instance, it’s very uncertain whether there is such a thing as a ‘Collective Extrapolated Volition’ for humanity that doesn’t depend very sensitively on the parameters of the extrapolation and aggregation. Issues like the Arrow Impossibility Theorem from voting theory, the malleability of focus for evolved moral sentiments, and the interaction between our number sense and motivation create looming challenges for such an approach.”
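To make the sensitivity point concrete, here is a toy sketch of my own (not Carl’s, and not anything from the CEV proposal itself). It assumes a hypothetical electorate of nine ranked ballots over three options and compares three standard aggregation rules from voting theory: plurality, Borda count, and the Condorcet criterion. The ballots, function names, and option labels are all made up for illustration.

```python
from collections import Counter

# Hypothetical ballots: each tuple ranks options from most to least preferred.
ballots = (
    [("A", "B", "C")] * 4
    + [("B", "C", "A")] * 3
    + [("C", "B", "A")] * 2
)

def plurality(ballots):
    """Winner is the option with the most first-place votes."""
    counts = Counter(b[0] for b in ballots)
    return max(counts, key=counts.get)

def borda(ballots):
    """Winner has the highest Borda score (n-1 points for first place, down to 0)."""
    scores = Counter()
    for b in ballots:
        for rank, option in enumerate(b):
            scores[option] += len(b) - 1 - rank
    return max(scores, key=scores.get)

def condorcet_winner(ballots):
    """Return the option that beats every other by strict majority, or None."""
    options = sorted(set(ballots[0]))
    for x in options:
        if all(
            2 * sum(b.index(x) < b.index(y) for b in ballots) > len(ballots)
            for y in options
            if y != x
        ):
            return x
    return None

# A second hypothetical profile: the classic three-voter preference cycle.
cycle = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

print(plurality(ballots))         # A
print(borda(ballots))             # B
print(condorcet_winner(ballots))  # B
print(condorcet_winner(cycle))    # None
```

The same nine ballots yield A under plurality and B under Borda, and the three-voter cycle has no majority winner at all. This is only a cartoon of the problem, but it shows how the “answer” can hinge on the choice of aggregation rule rather than on the preferences alone.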

One fundamental challenge here, in my view, is that any proposed mechanism for self-correction at a high level of abstraction produces more opportunities for a catastrophic failure of Friendliness. Consider, for instance, programming an AI to analyze a wide selection of possible extrapolation and aggregation parameters and to use some objective standard (presumably derived from the region of greatest coherence among previous extrapolation and aggregation attempts) to update moral preferences. The simpler the goal system, the easier it will be to update and revise without the risk of it drifting far off course. The problem is that the simplest goal system capable of success may have to be very complex, even in its basic principles.