Teaching Embodied Agents, Applied to Virtual Animals in Second Life

 Posted by Jeriaska on December 8th, 2008

novamente_pet.jpg

At AGI-08: The First Conference on Artificial General Intelligence, Novamente LLC CSO Ben Goertzel presented a paper by Cassio Pennachin et al. on a teaching methodology called Imitative-Reinforcement-Corrective (IRC) learning, proposed as a general approach for teaching embodied non-linguistic AGI systems.  IRC is a framework for automatically learning a procedure that generates a desired type of behavior.  A set of exemplars of the target behavior type is used for fitness estimation, reinforcement signals from a human teacher are used for fitness evaluation, and the execution of candidate procedures may be modified by the teacher via corrections delivered in real time.

The following transcript of the AGI-08 conference presentation “IRC Learning and the Novamente Cognition Engine” has not been approved by the speaker. Video is also available.


pennachin_01.png

I am going to talk about some work we have been doing applying a simplified portion of our Novamente cognition engine AI system to the problem of controlling virtual agents in virtual worlds. We are taking a somewhat different approach from the previous talk in that rather than trying to code rules, using a logic engine to govern the high level functions, we are using more of a learning approach, working with virtual animals and doing more basic tasks, dealing with perception, action and social interaction.

pennachin_02.png

I am going to go through the first three or four minutes telling a little bit about the Novamente AI approach for those who might not be familiar with it.  I will talk about what we are doing with virtual pets.  For the last couple minutes, we will show a video of virtual pets in Second Life.  Most of you are at least slightly familiar with the Novamente Cognition Engine and in three minutes I am probably not going to enlighten you too much further.  It is an approach to artificial general intelligence which has a pretty large and well fleshed out design spec and architecture.  I believe it is capable of being taken all the way up to human-level AGI, though I am certainly not going to try and justify that in the first three minutes of this talk.  I am merely going to overview some of the key aspects of the architecture.

pennachin_03.png

In terms of knowledge representation we use nodes and links (a weighted, labeled hypergraph), which combines aspects of probabilistic semantic networks with some aspects of attractor neural networks; we have some Hebbian learning in there.  It is an evolving graph with nodes and links.  Some of the nodes and links may represent very low level, specific things like perceptions (something in the world) or actions (trying to move a certain joint on a real or virtual robot).  Some of them are abstract concepts, while some of them are nodes that have no English-language correlate, formed by the system as combinations of other things.
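
To make the idea of a weighted, labeled hypergraph concrete, here is a minimal Python sketch. It is not Novamente's code: the class names (`Link`, `HyperGraph`), the node kinds, the link labels, and the Hebbian update rule are all assumptions chosen for illustration. It shows links that can join more than two nodes, and a weight that strengthens when two nodes are active together.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Link:
    label: str                 # relation type (hypothetical labels, e.g. "hebbian")
    targets: Tuple[str, ...]   # names of the nodes joined; may be more than two
    weight: float              # strength in [0, 1]


class HyperGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}   # node name -> kind (percept, action, concept)
        self.links: List[Link] = []

    def add_node(self, name: str, kind: str) -> None:
        self.nodes[name] = kind

    def add_link(self, label: str, targets: Tuple[str, ...], weight: float) -> Link:
        missing = [t for t in targets if t not in self.nodes]
        if missing:
            raise KeyError(f"unknown nodes: {missing}")
        link = Link(label, tuple(targets), weight)
        self.links.append(link)
        return link

    def hebbian_update(self, a: str, b: str, rate: float = 0.1) -> Link:
        """Strengthen the 'hebbian' link between two co-active nodes."""
        for link in self.links:
            if link.label == "hebbian" and set(link.targets) == {a, b}:
                break
        else:
            # No such link yet: create one with zero strength (assumed rule).
            link = self.add_link("hebbian", (a, b), 0.0)
        link.weight += rate * (1.0 - link.weight)   # move toward 1, never past it
        return link


graph = HyperGraph()
graph.add_node("see_ball", "percept")
graph.add_node("move_leg", "action")
graph.add_node("play", "concept")
graph.add_link("evaluation", ("play", "see_ball", "move_leg"), 0.8)  # three-way link
graph.hebbian_update("see_ball", "move_leg")
link = graph.hebbian_update("see_ball", "move_leg")
```

Storing node names rather than object references in each link keeps the sketch short; a real system would presumably attach richer truth and attention values to both nodes and links.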

pennachin_04.png pennachin_05.png

We have a container called the Atom Space, being a bunch of nodes and links.  A number of cognitive agents act on it, carrying out various more or less intelligent processes.  It is a networked architecture, so you have a whole bunch of these atom tables and mind agents working on them, running on a bunch of different machines.  In terms of the learning algorithms that live in these little mind agent boxes, that is really the deepest part.

pennachin_06.png

We have a probabilistic logic system called PLN, which Matt Ikle talked about one aspect of earlier today, in terms of how it deals with quantifiers.  There is also an automated program learning system called MOSES, which was developed primarily by Moshe Looks, and a mechanism using artificial economics for allocating attention among different nodes and links and different mind agents in the system.  The idea is that all of these things are supposed to work together, alleviating each other’s combinatorial explosions, allowing the system to carry out complex goals in complex environments in a scalable way.
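
The phrase "artificial economics for allocating attention" can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not the mechanism used in Novamente: the function names, the wage and rent amounts, and the rule that funds cannot go below zero are all invented here. Each atom holds some funds; atoms that proved useful in a cycle earn a wage, every atom pays rent, and the richest atoms form the focus of attention.

```python
from typing import Dict, List, Set


def allocate_attention(
    importance: Dict[str, float],
    used: Set[str],
    wage: float = 10.0,
    rent: float = 1.0,
) -> Dict[str, float]:
    """Run one cycle of a toy attention economy and return the new funds."""
    updated = {}
    for atom, funds in importance.items():
        funds -= rent                      # every atom pays rent each cycle
        if atom in used:
            funds += wage                  # atoms that were useful earn a wage
        updated[atom] = max(funds, 0.0)    # assumed floor: no debt
    return updated


def attentional_focus(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k atoms with the most funds."""
    return sorted(importance, key=importance.get, reverse=True)[:k]


funds = {"ball": 5.0, "stick": 5.0, "tree": 0.5}
funds = allocate_attention(funds, used={"stick"})
focus = attentional_focus(funds, k=1)
```

After one cycle in which only the stick was useful, the stick holds the most funds and is the single item in focus, while the tree has run out of funds.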

pennachin_07.png

The cognitive architecture ultimately is not that different than the architecture underlying Stan Franklin’s LIDA system or Nick Cassimatis’s Polyscheme.  It is based on modern ideas in cognitive science.  The methodology we are using at the present time is based on social, embodied interactive learning, and we are now focusing on virtual worlds for many of the same reasons Sibley outlined in the previous talk.

pennachin_08.png

I don’t want to dwell on this, but this is an architecture diagram somewhat similar to the LIDA diagram that was shown before.  We have perception coming in from a simulation environment, nodes and links and mind agents concerned with sensory processing. Then there is an active memory that combines short term and long-term memory dynamically.  You generally have a dynamic where perceptions come in, they go into memory, they get thought about, they generate actions and do things in the world, and you keep looping around.
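
The loop described here (perceive, remember, think, act, repeat) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the `LoopAgent` class, the fixed table of responses, and the use of a small queue as short-term memory stand in for far richer processes in the real architecture.

```python
from collections import deque
from typing import Dict, List


class LoopAgent:
    def __init__(self, responses: Dict[str, str], short_term_size: int = 3) -> None:
        self.responses = responses                       # percept -> action (hypothetical)
        self.short_term = deque(maxlen=short_term_size)  # recent percepts only
        self.long_term: Dict[str, int] = {}              # how often each percept was seen

    def perceive(self, percept: str) -> None:
        self.short_term.append(percept)
        self.long_term[percept] = self.long_term.get(percept, 0) + 1

    def think(self) -> str:
        # Scan recent percepts, newest first, for one with a known response.
        for percept in reversed(self.short_term):
            if percept in self.responses:
                return self.responses[percept]
        return "idle"

    def step(self, percept: str) -> str:
        self.perceive(percept)
        return self.think()


def run_loop(agent: LoopAgent, percepts: List[str]) -> List[str]:
    """Feed percepts to the agent one at a time and collect its actions."""
    return [agent.step(p) for p in percepts]


agent = LoopAgent({"cat": "chase", "stick": "fetch"}, short_term_size=2)
actions = run_loop(agent, ["tree", "cat", "tree", "tree", "stick"])
```

With a short-term memory of two items, the agent keeps chasing for one step after the cat leaves its view, then goes idle once the cat has dropped out of short-term memory.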

pennachin_09.png

The Novamente Cognition Engine, which I have been talking about, is a pretty detailed and fleshed out design within this general approach.  As I’ve mentioned, we are also working on an open source project called OpenCog, which uses Novamente’s knowledge representation and overall software framework, but does not make strict commitments about the learning algorithms.  In OpenCog, you can plug different learning algorithms into our knowledge representation architecture and see what happens.  It will be launched later this year.

pennachin_10.png

Why do I think this might work?  Of course, you never know until you are there. We don’t have a human-level thinking machine yet. It could all fail for some reason that I’m not seeing yet. I put a decade of work into the mathematical and conceptual theory of mind underlying the Novamente system. For the last seven years since the Novamente system was founded, I have been working with a great team of software engineers. We are working on a really scalable, robust software framework underlying the thing. It certainly has been well thought through and there are at least no obvious mistakes there. We will see how it unfolds as we experiment with it over the coming years.

pennachin_11.png

Leading into the work on virtual pets, I like to think of our work on Novamente in terms of Piaget’s stages of cognitive development.  You start out at an infantile stage, where the system is just kind of fiddling around with sensation and actuation in the world.  Then you go on through more and more abstract stages, such as a concrete operational stage where there is a richer variety of learned mental representations.  Then finally, formal abstraction, where the system can control its logical operations in a way that is guided by its own experiences, effectively learning complex things.  Ultimately, you get to a stage where the system can understand and control all of its own structures and dynamics, and self-modify in a way that human beings are not fully able to because we lack that full introspection and control over our own mind states.

pennachin_12.png

A more fanciful way of looking at the Piagetian hierarchy is to start out with a low level, with babies or with animals.  They don’t need to have a deep formal understanding of things.  Then you can progress up from simple animals, to animals that understand language, virtual babies, virtual adults, and of course that’s not the limit.  We can go even beyond the human level and beyond that to some intelligence that is incomprehensible to us altogether.

pennachin_15.png

Now, getting to the practical part of the talk, we will be using a subset of this overall architecture to control pets in virtual worlds.  One of the papers that I think is going to be seen as one of the more important ones in the history of AGI when we look back is by John Laird.  The paper is “Human-Level AI’s Killer Application: Interactive Computer Games.”  This was in 2000, and I think it is going to prove prescient.  It has not really come to life yet, but with the growth rate of games and online virtual worlds, I think this is the domain where AI and AGI are going to first really flourish in a powerful way.

pennachin_16.png pennachin_17.png pennachin_18.png

On that same theme, if you look at animal-level AI as opposed to human-level AI, it seems to me that virtual pets, rather than AIBOs, are going to have the biggest impact the most quickly for many of the reasons that Sibley outlined.  Just a quick overview of the universe of virtual pets: virtual worlds are running rampant.  A quarter of the 34 million kids and teens on the web in the US now are visiting virtual worlds.  It is growing really fast.  There are more and more virtual worlds coming out on mobile phones and the internet.  You have almost half a million people joining virtual worlds now.

pennachin_19.png pennachin_20.png pennachin_21.png pennachin_22.png

There are virtual pets all over the place, mostly aimed at children: Neopets, Gopets, Club Penguin.  QQ Pets in China has 55 million users.  If any of you have children, you have probably seen these pet videogames on your TV screen or Nintendo DS.  There are pets on mobile phones; Nokia has them.  DOCOMO is launching them.  World of Warcraft is filled with pets.

The problem with these pets is that they’re pretty much all morons.  Nintendogs, which my daughter loves, have twenty-six tricks, and that’s all they can do.  Once you have taught them all twenty-six tricks, that’s it.  They can’t learn anything else.  They are rigidly programmed, they don’t respond emotionally in a flexible way, they lack personality, and they lack the ability to genuinely learn.  What is called learning in the virtual pets is just kind of unlearning the stupidity that is wired into them.

pennachin_25.png

How do you build a better pet brain?  You want the pets to genuinely learn and respond to the environment, so that users can bond with them more thoroughly.  That is what we have been looking at with building the Novamente pet brain, making pets that respond to and interact with objects, creatures and avatars so that if you have a virtual dog in a virtual world and a cat comes by, it will react to it. It may chase it.  You want to teach it that when you throw a stick, it should fetch it and bring it back.  It can then learn to go fetch attractive women and bring them back, even though that was never taught to it in advance.  You taught it by example and by reinforcement.

pennachin_26.png pennachin_27.png pennachin_28.png

I’m going to show video of a virtual dog in Second Life being taught to fetch and doing simple tricks like playing soccer by the user showing something and the pet trying to copy it.  We are looking at three kinds of learning here.  The first is imitative learning, where the teacher acts out a behavior, showing the student, in this case a dog, by example.  Reinforcement learning is more like how dogs are normally trained.  The dog does something and you reward it.  Generally positive reinforcement works better than negative, but you can give both.  Then there is what is called corrective learning, where if you want to get your dog to sit, you just push its butt down on the ground.  It will learn that that is what it is supposed to do, because you’re guiding it.
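
As a rough picture of how the three signals could combine, here is a small Python sketch. It is a sketch under stated assumptions, not the Novamente pet brain: a candidate procedure is just a fixed-length list of action names, the action vocabulary is invented, imitation and reward are simply added together, and plain hill climbing stands in for the MOSES program learner used in the real system.

```python
import random
from typing import Callable, List

# Hypothetical action vocabulary for a virtual dog.
ACTIONS = ["goto_ball", "grab", "goto_owner", "drop", "sit", "bark"]


def imitation_score(candidate: List[str], exemplars: List[List[str]]) -> float:
    """Imitative signal: best fraction of steps matching any teacher exemplar."""
    def match(a: List[str], b: List[str]) -> float:
        same = sum(x == y for x, y in zip(a, b))
        return same / max(len(a), len(b))
    return max(match(candidate, ex) for ex in exemplars)


def apply_correction(candidate: List[str], step: int, action: str) -> List[str]:
    """Corrective signal: the teacher overrides the action taken at one step."""
    fixed = list(candidate)
    fixed[step] = action
    return fixed


def learn(
    exemplars: List[List[str]],
    reward: Callable[[List[str]], float],
    length: int,
    iterations: int = 2000,
    seed: int = 0,
) -> List[str]:
    """Hill-climb over action sequences, scoring by imitation plus reward."""
    rng = random.Random(seed)

    def fitness(candidate: List[str]) -> float:
        return imitation_score(candidate, exemplars) + reward(candidate)

    best = [rng.choice(ACTIONS) for _ in range(length)]
    best_fit = fitness(best)
    for _ in range(iterations):
        candidate = list(best)
        candidate[rng.randrange(length)] = rng.choice(ACTIONS)  # mutate one step
        fit = fitness(candidate)
        if fit >= best_fit:                                     # keep if no worse
            best, best_fit = candidate, fit
    return best


fetch_demo = ["goto_ball", "grab", "goto_owner", "drop"]


def teacher_reward(candidate: List[str]) -> float:
    """Reinforcement signal: praise the dog if it ends by dropping the ball."""
    return 1.0 if candidate[-1] == "drop" else 0.0


learned = learn([fetch_demo], teacher_reward, length=4)
corrected = apply_correction(["bark", "grab"], 0, "goto_ball")
```

In this toy setting the exemplar alone would be enough to recover the behavior; the reward and correction functions are included to show where the teacher's other two signals would enter the search.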

pennachin_29.png pennachin_30.png pennachin_31.png

The combination of these three kinds of learning seems to be quite powerful.  You can also teach with a partner, which is a way of teaching parrots things.  A partner acts out the behavior and the dog tries to copy what the partner was doing.  The architecture we use for this is a simplified version of the big Novamente architecture diagram we saw before, where you have a component that controls the pets.  We have a store of the collective memory of all the pets, so each pet can to some extent benefit from what was taught to the other ones.  We have learning servers which use MOSES learning algorithms to learn new behaviors based on experience.  For the next generation of the pet architecture, we are working on integrating some more fun stuff like the probabilistic logic engine that Matt has been working on, and some natural language processing to help the pets more fully understand commands expressed in more complex English.

pennachin_32.png pennachin_33.png

The next step, which I don’t have time to talk about, is going beyond virtual pets with sensory motor behaviors to pets that exercise simple linguistic ability.  I really think this is where it’s at.  Once you get to the point where you have virtual agents in virtual worlds, you are teaching them language and they are learning to communicate better and better, I think that is where we are going to see an incredible exponential increase in the intelligence of our virtually embodied AGI systems.

novamente_movie.png

What we are going to see here is another Second Life movie of our prototype system and what it can do with a virtual dog.  This guy is human-controlled.  He is showing the dog how to fetch with a two-teacher method.  He is going to have his girlfriend play fetch for him, and then the dog is supposed to basically play the role of the girlfriend and learn to fetch like a dog does.  This is imitative learning.  Right now the language processing is very simple.  It can only handle some basic template phrases.  We have a fairly complex language engine that will integrate with this later this year.  The animation needs some work, as you can see, but this is an early stage prototype.  There is actually a lot more that we can do than that.  I hope that this hints at the ability for sensory motor integrative learning you can have in a virtual world environment.

agi-08_logo.png
