Legg and Mijic on Friendly AI
Two frequent Accelerating Future visitors/commenters, Shane Legg and Roko, recently got together in London to talk Friendly AI. Here's an excerpt:
It was great to discuss some of the problems that AGI/FAI research poses with someone else who is in the know about the subject. Shane has been at the cutting edge of both theoretical and practical approaches to AGI, including working with Marcus Hutter on AIXI and formal definitions of intelligence, and working for both Ben Goertzel and Peter Voss on some not-quite-yet successful practical attempts at AGI. He now works at the Gatsby computational neuroscience centre, a world-leading centre for the study of the human brain.
It was something of a disappointment that when we spoke about the dangers of AGI to humanity, I was the more optimistic one. Shane's way of putting it was something like:
"We're completely f**ked, we've got about 15 years before unfriendly AI kills us"
this is, considering his expertise, very worrying.
Heart-warming, isn't it? The fact of the matter is that "optimism" for AI often means pessimism for humanity's future, because self-improving AI without a moral sense is much easier to create than self-improving AI with a strong moral sense. Most futurists fail to grasp this for one of two simple reasons:
1) Belief in moral realism, that any sufficiently intelligent being will "discover" "the right morality" with little to no specific effort on the part of the programmers. Thankfully, moral realism is being annihilated by advancements in cognitive science, evolutionary psychology, and philosophy, and I've recently become more confident that any sufficiently intelligent/philosophically sophisticated/rational/whatever human being will realize that it's bunk.
2) Expectation of enough product/consumer feedback cycles that by the time general AI rolls around, all Friendliness errors are pretty much corrected, including ones that would only emerge under strong self-improvement in the superintelligent realm. This is pretty short-sighted, looking back at the history of products that remained on the market for many years after they were proven dangerous because they were simply too useful to give up that easily. My hope is that this dynamic will shed light on some of the most obvious potential failures of Friendliness, which will in turn inspire widespread recognition that Friendly AI-completeness is not a free lunch.
There is also a line of investigation in the post where Roko writes, "Also, I claimed that there are a lot of people on the planet who would really care about the FAI problem if only someone would tell them about it in a non-crackpot way. These include philosophers, prominent environmentalists, most of the Guardian reading left-wing middle class, most scientists and academics. Quite a lot of people, really. And the main impediment to progress in FAI is lack of money, credibility and human resource."
My perspective on this is that I really doubt it, but I won't stop trying. Fantasies by fringe groups that the mainstream will adopt their fringe view are as old as fringe groups themselves and have a terribly low rate of coming true. In my experience, presentation matters only marginally -- either people care about Friendly AI or they don't. Saying "we're worried about AIs self-improving and converting the planet into computronium, killing everyone on the planet in the process" does not typically elicit a radically different reaction than saying, "we're working on decision systems that embody the subtlety and complexity of human morality in a computationally tractable way". Both sound far out, no matter how they're dressed, and even if you get someone involved with "non-crackpot" language, further investigation demands "crackpot-like" language by necessity. If someone doesn't have the perceptiveness to tell the difference between crackpot-sounding but genuinely important lines of investigation and everyday crackpot-ness, then any complex meme you transfer to them is likely to be warped beyond recognition before they pass it on to someone else.
This hypothesis is empirically supported by the fact that people either seem to think Friendly AI is a big deal very quickly after exposure (within a year) or not at all. There are very few counterexamples. People like Steve Jurvetson and Peter Thiel understood the Friendly AI problem shortly after being exposed to it, not because they were exposed to it the right way but because they were smart enough to see the essence of the issue. Others will follow not when the idea is presented the right way, but when it has enough associated status, through public endorsement by prominent figures, that they can mention it at a cocktail party (along with the prominent figure in question) in a way that gives them a substantial chance of sidestepping the ultimate horror of social ridicule.
It is interesting to observe that most people who get really interested in Friendly AI are either high-status enough to be confident in what their own brain tells them (compsci Ph.Ds like Matt Mahoney, VCs like Jurvetson and Thiel, thought leaders like EFF chairman Brad Templeton, and countless others) or low-status enough to believe what they want without backlash from their status-obsessed peers (myself at age 17). The middle is where you get into trouble.
April 2nd, 2009 - 04:15
I realise that this should be obvious to most readers, but for those who missed it, since of course they weren’t there in person at the time: yes, I was joking around a bit with Roko when I made these comments. For his part, Roko’s motive for a positive singularity involves Liv Tyler, but with red hair. Apparently. Hey, each to their own ;-)
More seriously, I suspect that we may somehow manage to avoid catastrophe. I think a lot of people suspect this, which is probably a bad thing. Blame Disney movies as a child. I guess that’s the optimist in me. Rationally, or at least more analytically, I have to say that I’m really not sure *how* we’re going to manage this feat. The problem looks, well, hmmm. Complex. Subtle. Excessively high risk (in an economics high-outcome-variability sense). And some days, yes, just plain out scary.
I do believe that good solutions exist. Furthermore, I don’t think they will be so hard as to be impossible to find with some serious human resource investment. Perhaps then Roko is right in suggesting that we need to take this idea more into the scientific mainstream.
For my part, I’ve been testing different arguments on academics I’ve met over the last year or so, with increasing success. Either my talk is getting better, or the mood among researchers is changing, or both. In any case, it’s moving in the right direction. The really hard part is going from “Yes, you’ve got a point. It could happen and if it does it could well be a serious problem.” to “I want to help.” So, yeah, my pitch seems to be getting better, but I’m having trouble “closing”, as they say in sales.
April 2nd, 2009 - 05:49
So you suspect that we will manage to avoid catastrophe, as an intuition, then you go searching for evidence to confirm that intuition?
Good luck with those academics! The atmosphere is improving. In my mind, many who say, “yes, you’ve got a point” are more saying it to be polite, but I could be wrong.
April 2nd, 2009 - 06:25
Yes, that’s my intuition, though I am suspicious of my intuitions. I just look for evidence that I think is robust.
I don’t think the people are being polite. In these environments people are constantly disagreeing about things, even if the person is a visiting big shot, let alone a mere post doc. I think the most significant factor working in my favour is a slowly creeping sense among AI (and related) researchers that some real progress is being made. The idea of building a machine with general intelligence is still a little crazy, but it’s nowhere near as crazy as it was 10 or 15 years ago. At least that’s my experience.
April 2nd, 2009 - 09:07
“Roko’s motive for a positive singularity involves Liv Tyler, but with red hair”
– clearly this would be *awesome*. Utilizing the optimizing ability of an AI to create the perfection of womanhood is, in my opinion, a very healthy goal to have. It’s a great way to focus the mind on *actually creating the FAI* rather than on having fun arguments with other researchers.
“For my part, I’ve been testing different arguments on academics I’ve met over the last year or so, with increasing success”
– yes, Shane’s pitch is good.
After seeing Shane in action, I was stopped by a charity worker soliciting donations for UNICEF outside King’s Cross station. I told her that I didn’t want to give £3 a month to them because SIAI was a more effective use of the money. She actually asked for the SIAI website once I’d explained things.
April 2nd, 2009 - 09:30
Roko, I think it’s incorrect to say that there’s any objective perfection of womanhood, but I guess you can say you’d be interested in your own subjective conception of perfection of womanhood. (Which could change sooner than you think after being desensitized to it.)
Also, the notion of arbitrarily creating conscious beings from scratch gets into ethical issues that may involve the eventual forbidding of such an act.
My Singularity fantasy is to be able to stop worrying about the Singularity.
April 2nd, 2009 - 09:45
“My Singularity fantasy is to be able to stop worrying about the Singularity.”
ROFL… No! That’ll suck for you! We’ll be like the men who went to the moon: they all struggled to come to terms with the fact that they’d done the most impressive thing in their lives and that it would all be somewhat downhill from there.
“Roko, I think it’s incorrect to say that there’s any objective perfection of womanhood, but I guess you can say you’d be interested in your own subjective conception of perfection of womanhood.”
Of course. That was what I meant – I was just helping myself to a bit of realist language.
“(Which could change sooner than you think after being desensitized to it.)”
But the period of “desensitization” would sure be fun!
Basically it’s another one of those useful motivations. Visualize concrete things that an FAI will give you and you become a better FAI researcher.
April 2nd, 2009 - 10:56
“I was stopped by a charity worker soliciting donations for UNICEF… She actually asked for the SIAI website once I’d explained things.”
Dude! That’s so cool :-)
April 3rd, 2009 - 02:27
> High-status…low-status… The middle
> is where you get into trouble.
I am firmly in the middle.
I am embarrassed about appearing crackpottish, but I have explained FAI etc. to various people whose intelligence I respect.
1. Family members.
2. Senior software engineers.
3. Others, including one who was well-educated in the humanities; one who is an autodidact in a variety of areas; and one who is not a member of the intelligentsia.
I only do this when I get the opportunity to speak in a relaxed bull-session-like environment, when people are accepting about speculative thought, and the result is usually fairly positive. A very few accept the arguments, some pass them over without judgment, and some dismiss them but without mockery. Only a few show suspicion.
In any case, now that a Google search on me brings up an association with these ideas, it’s too late!
But I wouldn’t dare bring this up at work, not even in a lunch-time chat, and I work in software….
Michael, I think you are right about the need for high-status endorsements, but Roko is also right about the feasibility of explaining this stuff. The two work together.
How did people start worrying about contaminated sausages, child labor, DDT, or global warming, to name a few issues of the last hundred years?
April 3rd, 2009 - 04:03
Michael,
I don’t think that it is accurate to say that moral realism is being ‘annihilated’. In fact, I think that the low water mark was probably about half a century ago when the non-cognitivist theories of the early part of the 20th century had proliferated and gathered a lot of steam. Since then, there seems to have been a renewed acceptance of realist theories. I’m not sure what has been happening on a smaller timescale (like the last decade).
I should say that I am writing about belief in moral realism among philosophers, but this is fairly natural as these are the people who defined what it is, and are experts about the debate etc. I do not consider myself an expert in meta-ethics (though I have taken graduate level coursework in the subject and am qualified to teach it), but I expect that I am more familiar with the background, and what philosophers mean by these terms, than is typical in the transhumanist community. I lean towards a naturalist position on metaethics, which is sometimes described as realism (and sometimes not), and I don’t think that transhumanism has anything to fear from moral realism. Indeed I’m not sure that the term is even being used to mean the same thing as in philosophical use. For example, I don’t think that the results in neuroscience or psychology have any bearing on the philosophical debate.
April 3rd, 2009 - 05:49
Toby,
The reason transhumanism has something to fear from moral realism is that if moral realism is true, then we can expect any sufficiently intelligent AI to find the true morality. Because I don’t think that such a thing is possible, to build such an advanced AI in the absence of specific engineering for moral cognition would be very unwise. Moral realism encourages people to just ignore Friendliness and build advanced AI without worrying about the morality part so much. (I personally know over a dozen such people, and some of them are in the forefront of AGI research.)
If you don’t think that results in neuroscience or psychology have a bearing on the philosophical debate in moral realism, we must have a very different understanding of it… most recently I read Joshua Greene’s doctoral dissertation, “The Horrible, Terrible, No Good, Very Bad Truth About Morality and What to Do About It”, which drew many of these connections. I also observe that I abandoned moral realism (and the idea of objective morality) when I began reading in evolutionary psychology.
A blank slate view of human nature encourages moral realism because it makes humanity look like an optimal fixed point in mind space, or even a quintessential mind. Evolutionary psychology and neuroscience are the solvents that eat up blank slate views of the mind by revealing its internal structure.
I’m pretty sure that I mean the same thing by moral realism as it means in philosophy. While I’m not an academic philosopher, I’ve read the works of many moral realists and non-realists.
Does your naturalist position of metaethics adhere to the three propositions listed on the Wikipedia page for moral realism?
It seems pretty clear to me that ordinary (secular) people think of morality in much less objective terms today than they did in, say, the 1950s.
April 3rd, 2009 - 08:02
Toby Ord
> “Since then, there seems to have been a renewed acceptance of realist theories”
Can you give me some references on this please, Toby?
April 3rd, 2009 - 15:47
A minimalist moral cognitivism/naturalism of the kind found in, say, Adam Smith (and, again, see James Otteson’s work on Smith) need not at all be lumped or equated with any strong (or even not-so-strong) moral realism, if by realism one means that any mind (Martian, AGI, what-have-you) will, inevitably, and by reason alone, converge on the same set of values and normative protocols. (Lewis Beck, e.g., has written on this sort of moral “universalism” and he was also a premier Kant scholar.) Human morality may have certain fundamental universals, but this doesn’t necessarily imply a very strong moral realism. And it would seem, Michael, that you actually have a rather pragmatic reason for skepticism about moral (or axiological) realism, and one with which I completely sympathize. Merely because humans have, over millennia, evolved normative systems which *seem* objective to most humans (and thus *are*, indeed, quasi-objective) and whose *fundamental* notions and components show little variation from Australian Aboriginals to Siberian Inuit, it is simply a *non sequitur* to conclude that an AGI will (at least eventually) “hit upon” some sophisticated, over-arching moral theory which either is co-extensive with evolved human morality, or else includes the latter as a subset or “special case”. This has always been, to me at least, so intuitively obvious (indeed, seemingly almost axiomatic) that the “Friendliness Problem” has also, from the get-go, seemed rather obvious and vital. (Which is why I’ve always been a bit perplexed by the cavalier non-attention by such as, e.g., Moravec.)
The Friendliness Problem is: how does one codify (algorithmically?) and instill a “moral” or “normative” sense that allows for (indeed, strongly implies) a FAGI which is capable of empathy, of *caring* (cf. Milton Mayeroff’s excellent study, *On Caring*) and of *virtue*? For that is, ultimately, what it comes down to: a caring, empathetic, virtuous superintelligence. Yet Eliezer has, in a recent series of discussions over at Overcoming Bias (http://www.overcomingbias.com/), repeatedly hit home the difficulty in trying to straightforwardly program, or even formulate, moral protocols. So it is a tough nut to even fully crack open, much less “get right”.
Fortunately, I think Ben (Goertzel) is on the right track (as best I’ve been able to keep up, rather spottily and intermittently, with his R&D program(s)) in creating a VR “kid” and letting the kid learn/develop general intelligence and, as part of that development, gain a nascent moral (or at least value) sensibility. This, of course, is coupled with Eli’s work in trying to actually specify and codify broad (meta)normative protocols. It may also ultimately take *human* interaction (i.e., brain-VR environment interaction) to really bring home the bacon on this one. And it’s way beyond my area of competence (by a country light-year, so to speak LOL) to give relevant criticism.
But I think an understanding of how human morality developed (along parallel lines with the development of human language; see again Otteson’s work), along with a study of virtue ethics and virtue-psychology, plus an understanding of the psychology and logic of other-regarding (see in this regard Tom Nagel’s classic *The Possibility of Altruism*), will surely aid in the developmental path toward a truly high-functioning FAGI. Shane’s probably right, though: at best we have no more than about a 15-yr window to lick the Friendliness Problem or we’re in deep sh*t. This concurrence comes, admittedly, from a non-expert.
I was up much of last night, and yet got up this morning at my more-or-less usual time, so I hope this is at least minimally coherent. Plus, my knee’s aching so it’s time to quit for now. Thanks to all earlier commentators, and to you, Michael, and to Shane and Roko in particular. (Btw, I just turned 51 on 3/31, yippy-kai-yay…and I’m expecting fully-functioning FAGI w/in my lifetime…)
Ciao…
April 3rd, 2009 - 15:58
Michael,
1) I do agree with the three propositions on the Wikipedia page (though I’m by no means certain).
2) Perhaps any ‘sufficiently intelligent’ AI can work out the truth about ethics, but a big question is what we mean by ‘sufficiently intelligent’. Obviously it could be much more intelligent than the smartest human and yet fail to work this out. It could also fail to *care*. There is a big gap between ‘morality is objective’ and ‘any super-intelligent machine will be able to determine the moral truth and will be motivated by it’, though you are right that the former could be used to offer weak support for the latter.
3) I agree that more secular people have come to think of morality in less objective terms, but largely this is the rise of moral relativism, rather than any of the other irrealist theories. Very few philosophers give any credit to moral relativism. The irrealist theories that have been popular among philosophers are quite different.
4) I really am quite skeptical of the relevance of recent scientific results to this debate, but a full explanation on my part will have to wait for another time. I’m not saying the research is uninteresting or that it isn’t suggestive, but it is harder than it looks to actually generate philosophical work from this kind of science.
Roko,
I’m really just referring to the general mood and can’t give specific citations. However, I think that the following book is an excellent introduction, and should be available in most university libraries:
http://www.amazon.com/Introduction-Contemporary-Metaethics-Alex-Miller/dp/074562345X
April 3rd, 2009 - 20:13
the more humans migrate to a more digital medium, the more they will be susceptible to complex and meaningful memes. i am positive that internet involvement has made users more intelligent, open, skeptical, scientific, aware of themselves, and aware of the technologically improving world around them. of the extent to which interacting (living, spending time) on the internet encourages the above characteristics i am unsure.
don’t count on your audience being the same people 20 years from now as they are now. on the other hand, don’t bank on it either..
for one to act against some future danger they must first perceive danger. before they assess a situation as dangerous, they must first imagine that situation. the internet does wonders for the imagination, with stories from all corners of the globe, and magnificent and powerful visual aids.
take this example, if you haven’t already seen it.
http://www.youtube.com/watch?v=_0dYPnui3rM&fmt=18
although goofy, its memorability is its asset. someone watches that, stores it away for whatever reason (‘scary’, ‘funny’, ‘cool’, w/e), and then say someone seriously brings up the issue of nanotechnology’s civilization-destroying potential. the person racks their brain. what do they remember? oh, “People were talking about this way back then..?”
the presence of a memory on a subject adds weight to the validity and worth of considering seriously that subject. the exact specifics of that memory are irrelevant, so they can be as memetically rich as one can imagine. the communication, at this juncture, need not be serious nor informed. the internet excels in that department, meme’tizing information into more memorable states.
tldr more people are spending more time on the net and that changes dynamics of influence over abstract, dreadful, outrageous subjects. a good meme often carries itself, no marketing needed.
April 5th, 2009 - 06:22
Thanks for the reference, Toby
April 5th, 2009 - 15:04
It would stand to reason that the “right” morality can be introduced into AGI to make it friendly for the same reason that nanotech can be achieved: nature / evolution made it happen for us.
Now of course it might take some work to work it out – especially to get it right.
Am I out of line if I suggest that AGI would be able to quickly read up on all of human writings, including on philosophy and morality, including these very discussions we are having on the subject of what it should be, making it possible for it to gain a sense of morality, preferably early on (i.e. before it has become Skynet)?