Abandoning the Ghost of Moral Realism Tuesday, Mar 10 2009
friendly ai 11:20 am
Over at his blog, Roko has been giving interesting accounts of how he has come around to “Human Values”, the arbitrary-but-all-we-know moral system around which our (SIAI & friends) current best guess of what to do with seed AI is constructed. Previously, Roko believed that convergent subgoals were objectively moral, and was an adherent of moral realism in general. In his most recent post, Roko writes:
“It’s been an awesome journey since I started this blog nearly two years ago. My original goal was to understand exactly what we mean when we say that we want to “make the world a better place using technology”. The answer is actually fairly complex – and I feel that very few people within the transhumanist community actually understand enough to have grasped this answer. This is because the answer is, in a way, disappointingly un-transhumanist. The non-existence of external, objective moral standards means that we are far less morally obliged to push a high-technology transhumanist agenda than we otherwise would be, and that even if the future does lead us to the high rates of technological change that transhumanists have so worshiped, the optimal way for us to use that technology will probably look quite mundane compared to the grand visions of some transhumanists.”
Roko then points to my “The Future Might Look Like the Past… at First” post from early 2006, which is actually one of the inaugural posts on this blog. In it, I write:
“When we gain the ability to manipulate reality easily, most people will probably not choose to live within the sanitized white hallways of science fiction or the boring monoliths of The Jetsons. We will create more forests, rolling grasslands, huge gardens, splendid castles, and other things we can’t yet imagine. We’re all human, and most humans foster a romantic yearning to recreate some idealistic past. The true past was a place of disease and suffering, but we love the pleasant outlines transmitted to us through stories and our imaginations.”
Now, when I consider a post-Singularity world, I imagine something much more “mundane”, and more similar to our present world, than I did around, say, 2002, during my “Computronium Shockwave” period. I believe there is a connection between how Friendly AI theories are created and what the set of imagined outcomes looks like, so this is important. Instead of worshiping technological progress, transhumanists should praise the fulfillment of human preferences, even if they are only tangentially related to technology or are quite low-tech. But high tech will be necessary to preserve the freedom required for people to pursue their own preferences (freedom from confrontation or undue influence by more powerful agents).
In the comments on Roko’s last post, Carl Shulman describes some of the open challenges within the CEV approach, adding to the public discussion:
“It’s also important to remember the extent to which these fields still have open problems with significant importance. For instance, it’s very uncertain whether there is such a thing as a ‘Collective Extrapolated Volition’ for humanity that doesn’t depend very sensitively on the parameters of the extrapolation and aggregation. Issues like the Arrow Impossibility Theorem from voting theory, the malleability of focus for evolved moral sentiments, and the interaction between our number sense and motivation create looming challenges for such an approach.”
One fundamental challenge here, in my view, is that any proposed mechanism for self-correction at a high level of abstraction produces more opportunities for a catastrophic failure of Friendliness. For instance, one could program an AI to analyze a wide selection of possible extrapolation and aggregation parameters and use some objective standard (presumably derived from the region of greatest coherence among previous extrapolation and aggregation attempts) to update its moral preferences, but every such layer is another place where things can go wrong. The simpler the goal system, the easier it will be to update and revise without the risk of it drifting way off course. The problem is that the simplest possible goal system for success may have to be very complex, even in its basic principles.
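To make the sensitivity to aggregation that Carl mentions a bit more concrete, here is a minimal Python sketch. The three-option preference profile and the function names are made up for illustration, and this is only a toy voting example under those assumptions, not a model of CEV itself. It shows the same set of preferences producing different “winners” under two standard aggregation rules, plurality and Borda count.

```python
from collections import Counter

# Hypothetical toy profile: (number of voters, ranking from best to worst).
PROFILE = [
    (3, ("A", "B", "C")),
    (2, ("B", "C", "A")),
    (2, ("C", "B", "A")),
]


def plurality(profile):
    """Score each option by how many voters rank it first."""
    scores = Counter()
    for count, ranking in profile:
        scores[ranking[0]] += count
    return scores


def borda(profile):
    """Score each option by rank position: n-1 points for first, 0 for last."""
    scores = Counter()
    for count, ranking in profile:
        n = len(ranking)
        for position, option in enumerate(ranking):
            scores[option] += count * (n - 1 - position)
    return scores


def winner(scores):
    """Return the option with the highest score (no ties in this profile)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print("plurality:", winner(plurality(PROFILE)))
    print("borda:    ", winner(borda(PROFILE)))
```

With this profile, plurality selects A (the option with the most first-place votes), while Borda selects B (the option most voters rank reasonably high). Nothing about the underlying preferences changed; only the aggregation rule did.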




“Computronium Shockwave” period? I’d love to hear that theory…
In all seriousness, I think that an important issue is raised here. If you look at what most people who are introduced to the concept of “transhumanism” end up believing in, it is definitely not the more refined humanistic transhumanism that has taken me somewhere between 18 and 36 months to home in on.
I think that there is a lot of work to do on getting the right message across, and I’m thinking of joining the h+ board at the next opportunity to help with the task. I feel that the memetic space of transhumanism is so full of heated arguments and wild claims (pro and con) that the *essential message* of using technology to further *our values* basically gets trampled. And it is a subtle message. Well, it seems that it took me a while to tease out the truth.
Thanks for the mention!
R
“Computronium Shockwave” was a geocities site I made in 2001 where I argued that supermorality would emerge out of a sufficiently intelligent AI. Shortly after that I began an intensive reading program of evolutionary psychology, cognitive psychology, and philosophy, and came to many of the same conclusions that have been laid out explicitly by Eliezer on Overcoming Bias over the past two years.
““Computronium Shockwave” was a geocities site I made in 2001 where I argued that supermorality would emerge out of a sufficiently intelligent AI.”
– It appears that everyone makes some version of this mistake…
“It appears that everyone makes some version of this mistake…”
This isn’t true, but I’d be curious to see some correlates of the view. In my case I was lucky enough to get early exposure to Hume and Mill in philosophy textbooks, and Hume’s treatment of the fact-value distinction rang true. That early exposure may have helped me to interpret my moral feelings as emotions rather than perceptions.
I’d appreciate any commenters contributing their own data points.
You’re on to something in your last paragraph, Michael, but I’m too tired and drowsy to add much of a substantively-helpful comment at the moment. Suffice it to say, it might be good to somehow enable an FAGI to revise its (applied-)axiological protocols (or even meta-protocols) BUT, you’re right to warn that this might be problematic the more complex the original protocols are, and there is fairly good reason to expect the latter to be rather complex/sophisticated indeed. What might solve the problem and allow for learning-and-revision (along the “edges” so to speak) is if some sort of *core* (and thus **relatively**, not especially simple, yet nonetheless **simplER** than the entire ensemble) set of (applied-axiological) protocols could be devised which would be (meta)stable even *through* learning-and-revision of some (perhaps even many) of the non-“core” protocols. That is, in other words, a “core” set of protocols which then generate ancillary ensembles of protocols, the latter open to learning-and-revision, while the former is not. But, of course, merely articulating this sort of theme is easy. Instantiating it in silicon (as it were) takes brighter and more learned minds than mine, and more resources than I currently have at my disposal, that’s for sure.
And as for the whole moral realism thing…I think one has to, at least broadly-speaking, be a moral (axiological) *cognitivist*, whether or not one is, strictly-speaking, a realist. I can only, now, off the top of my tired noggin, think of Richard Hare and Nicholas Rescher as two philosophers it might behoove someone in the FAGI field to check out. Perhaps also Bernard Gert and Alan Donagan. No matter what categories one might pigeon-hole them into, these guys’ works are worth a look. Peter Vallentyne is a younger, more contemporary guy you might also check out…gotta go…
Ciao…
http://www.geocities.com/imminentsingularity, heh.
Welcome, Roko, to a more subtle, more sophisticated, and more coherent approach to increasingly effective promotion of our evolving values.
If you were to run for board election on this basis, I might even rejoin h+.
“Welcome, Roko,”
– thanks!
“If you were to run for board election on this basis, I might even rejoin h+.”
– I would be surprised if my position wasn’t already quite close to the average view that h+ puts forward. I was more saying that, at least up until now, this message has not come through very clearly…
Michael Anissimov is on the h+ board, so he may be able to comment… any thoughts, Michael?
Anyway, your support is appreciated, Jeff. Thanks!
That site is a blurry look into the past.
AI, the Unconscious, & Morality…
One of the more important discussions that is going on today rarely makes the news but whether or not true Artificial Intelligence is possible and whether or not such AI will be more intelligent than H. sapiens, may well be……
What you fail to mention is that an illogical and unsupported bias is a necessity imposed by fundamental physical limits on any decision making system. A super-intelligent AI will not be fundamentally different from us, and will have profoundly illogical reactions at times.
The first problem is simple: any sensor, no matter how sophisticated, is limited to a partial view of the world. This will introduce a gap, an uncertainty, a gap that needs to be filled before any “intelligence” (artificial or human) makes any decision. Therefore an ideal decision (by a human or an AI) will be based, just like our own decisions, on 10% facts and 90% unjustified and, at best, weakly supported, assumptions.
Worse: any AI will go into denial. It is easy to see why: suppose you have made a decision to carry out a simple task, say something trivial like opening a door. This is no easy task, and requires massive coordination efforts. Your leg muscles must remain at certain tension levels, changing over time, that relate to the readings your eyes take and the readings coming from your ear. Your hand must move in sync with all those movements, must furthermore track the handle of the door, and create a feedback loop to prevent pushing too hard or too weakly on said handle.
All these processes are continuous processes. Interrupting them is simply not an option. Interrupting your stride will, at 99% of the possible times to do so, simply cause you to fall down, flat on your face, potentially with a broken arm or leg or even death if the moment or direction of falling was particularly ill-chosen.
Suppose now that some disaster happens. A meteorite strikes (or something more mundane: for whatever reason, the door’s position changes, say someone storms in through the door), and the door is gone. You’re basically okay, but where the door was there is now a large gaping hole. Falling into said hole would be very ill-advised.
What are you going to do? Well initially, you’re going to keep walking towards the position the door used to occupy. Many of the systems that are coordinating your movements will simply not believe that the situation has changed: your hand will still move towards the previous position of the handle (and still use the feedback loop to limit the forces it would encounter if it collides with the now-nonexistent door handle), your legs will still be coordinated to move your body towards the door, and even when your “conscious mind” realizes the fact that you’re basically throwing yourself into a hole in the ground, initially the commands will be to nevertheless continue forward, in order to avoid losing balance.
This is, plain and simple, “denial”. It’s not just denial, which is but one of many psychological problems that plague humans, but it is a type of denial that is already shared between humans and existing AI systems.
Denial, and its uglier siblings like schizophrenia and narcissism, are simply consequences of the limits of our sensory perception. Obviously overreacting to them (like, say, spending trillions of loaned money to solve a problem that is basically too much debt) is not, but how will the AI decide what, exactly, is an overreaction?
But that problem is but a grain of sand compared to the whopper that Kurt Gödel whispered into a friend’s ear about 70 years ago. You might think that more intelligence means that you can readily solve more problems. That’s true, but it is very limited.
All_problems > Math_problems > Solvable_Math_Problems > NPC > NP > aleph(2)
(aleph(2) is the number of real numbers that exist)
Any AI will find itself in the same position in this comparison as humans: in the lower regions of the NP domain. It may (may) be that he can get a bit higher than humans can, but until NPC becomes a solved problem, there is no way any AI will perform anything except slightly better than humans on NPC problems.
Even if it somehow magically manages to do what no human has ever been able to do, even in trivially simple cases with massive computer aid, and manages to solve NPC problems in a marginally efficient way, those problems are but a small subset of what this AI will encounter in everyday life.
The short of it: even with mathematically impossibly accurate sensors and a quantum algorithm running on a massive quantum computer, an AI will have but to look outside his window to see thousands of problems he cannot hope to provide a rational answer to.
Worse, he will, just like humans, have to provide an answer to those problems that will not bend, even an inch, to rationality.
In other words: any AI that hopes to exist in the real world, no matter how powerful or advanced this AI and the machine that is his/her body might be, will have to make dozens of decisions not based on logically sound arguments, before he even exits the door of the room he/she was created in.
This AI will, so to speak, be intimately familiar with “the human condition”. This AI will be human, unsure and alone, just like the rest of us. He (or she) will have a religion, for he will soon realize that you cannot just pull a working morality out of thin air and hope it works. He or she will let that religion override at least some rational determinations he or she might have made with a different religion (the definition of “bias”).
This way of acting is not even irrational, no matter how many people walk on water or skies rain fire on gays in the relevant religion, because there is no alternative (even though some might think atheism is not a religion, atheism in itself is an irrational bias. And that’s in the absolute optimum case. Don’t forget that the largest group of atheists in history (and presently) are the communists (they prefer the term “socialists”). They, however, have a religion. Socialism. A wrong one at that).
Any “superhuman” AI will be as human as you or I. Our biggest mistakes will be to make only one of it, and not to raise it like we would raise a kid, allowing time spent learning physics laws, time spent kicking a ball around, time spent honoring something bigger than itself (and for obvious reasons, let’s introduce this AI to Jesus and not to he-who-must-not-be-named), not knowing if it really exists, and time spent falling into some girl’s beautiful eyes.
That would, by a very large margin, be our very best bet of preventing this AI from going haywire. The building blocks of madness are present even in today’s limited AIs, and we can’t take them out.