Will Any Sufficiently Intelligent AI Designer See the Necessity of Friendliness Theory?
As I'm working my way through the Greene dissertation (currently on page 163, thanks to Roko for pointing this out to me originally via his blog), I feel myself getting more optimistic about a certain issue: that is, that any sufficiently intelligent human will figure out that morality doesn't generate itself automatically in the absence of very sophisticated and specific cognitive hardware. This is wonderful, because it lessens the probability of unFriendly AI, that perennial Singularitarian bugaboo.
In the dissertation, Greene points to three things: psychopaths, Phineas Gage, and similar patients with damage to their ventromedial frontal lobes, which are apparently indispensable to generating moral evaluations and feelings. They all display significant intelligence but an inability to have strong feelings about right and wrong, or to distinguish between moral rules and conventional rules. This strongly suggests that intelligence is independent of morality. These findings were buttressed with fMRI studies Greene did on criminal psychopaths.
Anyone who philosophizes enough will figure out that morality is in the human mind, not out in the world. Anyone who does fMRI brain scans on psychopaths and normal people or is aware of such work will realize the same thing. Also, I am hopeful that anyone who develops a sufficiently well-developed AI design will realize the same thing. In the best-case scenario, unFriendly AI simply doesn't happen because people realize what they're doing before they finish the task. Unfortunately, this is wildly overoptimistic, because just recognizing the need for a complex and subtle morality is not enough to build one.
Continuing on my optimistic tangent, I am hopeful that anyone building human-equivalent artificial intelligence will at least make a token effort to instill it with some sort of top-level morality. There is little danger of this being Asimov's laws, because Asimov's laws are not descriptive enough to really set an AI or robot in motion. Any attempt to build a goal system based on Asimov's laws will have to introduce a heck of a lot more complexity to yield a product that has any use. If this morality is then applied to test cases, as it hopefully would be if the AI in question has any significant responsibility, then it will quickly become apparent that morality is not a free lunch and that the AI fails miserably on many moral tests. If pre-human-equivalent AI does significant damage due merely to accident (an issue I am exploring at length by reading Moral Machines by Wendell Wallach and Colin Allen), then a hands-off approach to AI morality will get a bad reputation, and there will be a strong profit motive to produce genuinely Friendly AI. Another obvious pitfall here is that the programmers craft a morality sufficient for a non-self-improving human-level AI but which fails catastrophically and fatally (to us) when the AI improves itself. But at least it would fail for that reason rather than because the programmers made an even dumber error.
Basically, this post is an exercise in optimism -- I spend much time on this blog wringing my hands so vigorously that one might think the skin will be sloughed right off. But it's Sunday, so here I am saying, "what if... [optimistic scenario]?"
Going even further back than Greene, who just published his dissertation in 2002, we have Hume, who said, "'Tis not contrary to reason to prefer the destruction of the whole world to the scratching of my finger", showing he was completely aware of the reason/morality distinction; he is in fact the premier historical philosopher in favor of this view. It seems to be in the interest of those concerned about unFriendly AI to spread the moral philosophy of Hume as much as possible. Ditto with moral sense theory, a non-cognitivist (non-realist) theory that morality is grounded in complex sentiments and emotions. The seeds of the idea that Friendliness theory (the challenging study of how to create a Friendly AI) is necessary are planted right there.
March 15th, 2009 - 11:45
Is it possible that morality also depends upon certain specific environmental conditions, as well?
I mean, morality is in the mind, insofar as it’s about how minds act. But it’s also very much in the world, because that world happens to be full of other minds.
It would be nice to think we can overcome what we might lack in necessary environmental conditions with more stringent mind design.
March 15th, 2009 - 13:14
This is remarkably consistent with the thinking I did on my GURPS Transhuman Space roleplaying game setting (that never launched). In this game you have NAI, LAI and SAI variants – Nonsapient, Low sapient and Sapient. THS didn’t postulate superhuman AI.
I looked at this and came to the conclusion, somewhere in 2001, that all AIs will be as methodical and ruthless as killer wasps, or black widows, or mantises. I postulated (in an RPG, D&D spell-like paradigm) you would have these programs you could (and legally would have to) patch into any AI, preferably at a “hard to revoke” hardware level, to literally simulate a pseudo-human psychology. In the game setting I labeled this a ‘social interface’, i.e. a totally inhuman MDV (killing robot) roaming asteroid fields would be partially as cold and ruthless as any T1000, but inside its operational programming would be a simulation of something resembling a human.
The American and Chinese models wouldn’t have citizenship rights, but in the EU all applications of SAI types would legally (and safely!) be given citizenship status.
The point is, human minds have “somewhat facultative” moral subroutines. You don’t know in which humans they are strong, and in which humans they are nonexistent. (Being an outspoken neoconservative may be a clue to moral incapacity!) – the programming in the military AI that emulates alliance and social behavior and morality (and a sense of trustworthiness) is one that incorporates social behavior.
Which brings to mind the Matrix movies, where AI minds rubbed shoulders with human minds, in meaningful ways (the Oracle is a sweet lady, the Merovingian such a charming guy, Persephone is a good kisser!).
In this GURPS THS “somewhat campy” RPG setting, characters would of course interact in deep and meaningful ways with characters and other humans. It means something if you go into space battles with an SAI if you share her persistent “bunny furry fetish” in virtual reality interaction. It builds trust, camaraderie or (even better) love.
While the THS metaphor may be typical RPG fare (even if fairly sophisticated) I think we could do worse than to start out by emulating human core values and human core psychologies in even the most stale of AIs. Interacting with a self-generated avatar of a dedicated accountancy AI would be very illuminating.
March 15th, 2009 - 15:29
That human-Friendly morality must be somehow implemented in AI is only the first step necessary to start thinking about addressing the real problem. The next step is to realize that it’s impossible to implement this morality explicitly, that one has to build a mirror, rather than a picture. Many people make the first step, but virtually no one has properly done the second. There are also mixed states, when people try to explicitly specify some parts of morality, and use ad-hoc ways to “teach” the rest. And of course there are fake states, in which people mistake the specific caricature they conjured up for the whole human-Friendly morality. This is the third step: not only can’t you explicitly program morality in AI, you also can’t explicitly imagine what it is about.
March 15th, 2009 - 16:18
Michael (Anissimov): I heartily concur with you on the importance of Hume. But even more important is Hume’s close friend and really the doyen (even more so than Hume or Adam Ferguson, at least imho) of the Scottish Enlightenment, none other than good ol’ Adam Smith. Smith’s two most famous master-works (both unquestionably magisterial for their time and, arguably, for ours) are, of course, *The Wealth of Nations* and *The Theory of Moral Sentiments*. This latter work is, again imho, one of **the** most important treatises in ethical/moral philosophy ever written (much more important than even Kant [who is not unimportant] and at least every bit as important as even Aristotle). What Smith actually accomplished in *The Theory of Moral Sentiments* (nothing less than an account of how **human** moral protocols and sentiments evolved and developed) is superlatively discussed in, again, James Otteson’s recent book, *Adam Smith’s Marketplace of Life* (Cambridge U. Pr., 2002). Note the date of publication. While Otteson adumbrates some of his book’s themes in earlier publications, much of what he says there couldn’t have been taken into account by Greene in his dissertation, yet I think that a synthesis (oooh, yuck that smacks of Hegelianism, don’t it? LOL) of the themes Greene deals with together with the themes found in Smith/Otteson would be on the right track…
And as far as normative reasoning (which is inherently both cognitive and conative) goes, some other important works are: the late Alan Donagan’s *The Theory of Morality*, the late Alan Gewirth’s *Reason and Morality* and *Self-Fulfillment*, and Richard M. Hare’s *Moral Thinking: Its Levels, Method, and Point* and *Sorting Out Ethics* (this latter, of course, a bit more than the former, is primarily concerned with metaethics).
But, Michael, I don’t expect you to read all those, but please do get Otteson’s *Adam Smith’s Marketplace of Life*: it is a brilliant exegesis of a work that David Hume himself considered to be one of the most important treatises on ethics he’d ever read (it, *The Theory of Moral Sentiments*, that is, having appeared [in 1759] 8 years after Hume’s own [1751] *An Enquiry Concerning the Principles of Morals*, and being heavily influenced by that earlier work). PLEASE check out both Smith himself and Otteson…you’ll be glad you did…
Just from having read some of Ben’s (Goertzel’s) stuff, I suspect he would appreciate Smith’s accomplishment and Otteson’s expository analysis.
Anyway THANKS to you—and to Roko!!—for the Greene dissertation reference…
Ciao… ;)
March 15th, 2009 - 16:57
Will Any Sufficiently Intelligent AI Designer See the Necessity of Friendliness Theory?
– One wonders what is going on in the case of Peter Voss, then, doesn’t one? Or of the 50 or so AGI researchers at AGI ’08 who said they’d rather exterminate the human race than stall AGI development….
In this case the implication is that none of these people are that clever.
March 15th, 2009 - 16:59
“Anyway THANKS to you—and to Roko!!—for the Greene dissertation reference…”
– Much appreciated, Michael and MCP
March 16th, 2009 - 23:50
i could see an environment with AI worker scientists kept sufficiently distant from each other to the degree you get a http://en.wikipedia.org/wiki/Bystander_effect type thing goin. each thinks ‘oh the other guys must be working on the morality part’.
and of course there are scientists/groups of scientists who will deliberately seek the construction of some sort of immorally inclined AI.