How Might Probabilistic Inference Emerge from the Brain?

 Posted by Jeriaska on May 31st, 2008

probabilistic.png

At the AGI-08 Conference on Artificial General Intelligence, Ben Goertzel presented a paper he co-authored with Cassio Pennachin, in which a series of hypotheses is proposed connecting neural structures and dynamics with the formal structures and processes of probabilistic logic. In this framework, Hebbian learning at the synaptic level would be expected to have the implicit consequence of probabilistic deduction at the logical statement level.


The following transcript of Ben Goertzel’s AGI-08 presentation “How Might Probabilistic Inference Emerge from the Brain?” has not been approved by the speaker. Video is also available.

How Might Probabilistic Inference Emerge from the Brain?

probable_1.png

As I emphasized in my opening talk for this conference, I think there are a lot of different approaches that can potentially lead to artificial general intelligence. The Novamente engine is the one I am working on now, which seems to me to have the best chance of getting to powerful AGI quickly, but I cannot resist my own brain exploring all sorts of different possibilities in what little spare time the Novamente project leaves me.

My wife Isabella, as it happens, is doing graduate work in cognitive neuroscience at the University of Maryland, College Park, so I have been immersed in the various neuroscience papers she brings back from school. I cannot help but think, how does the stuff we are doing with Novamente relate to what happens in the human brain and mammalian brains more generally? I certainly do not pretend to know the answer to that, but I found some time to pursue some speculations. Together with my collaborator Cassio Pennachin, who is the CTO of Novamente, I made some small simulations to explore some ideas about how some aspects of logical probabilistic reasoning can be made to emerge from neural net-type dynamics.

This is a direction to AGI I would be interested in exploring if I had not conceived of the Novamente design. There is potential to get logical operations and higher order thinking to emerge out of neural networks. I think in many ways it is a harder path than building a system like Novamente, but it is still an interesting one. What I am going to do now is define some ideas Cassio and I have had about how probabilistic inference might emerge from the brain. This is not to say that I think probabilistic inference is the only important thing in cognition or AI. It is a specific question that seemed interesting to me because logical reasoning seemed like one of the aspects of intelligence and of AI systems that is hardest to connect to what is going on in the brain.

If you take something like evolutionary learning, spreading activation networks, reinforcement learning, the connection with the brain is a bit better understood and better established. Building the connection between logic and the brain is a bit more speculative at this point. I think we have at least generated some creative ideas in that regard.

probable_2.png

The starting point of our speculation here is the hypothesis that cell assemblies are important. This of course goes back to Hebb in the ’40s–it is not a new idea. We start from the hypothesis that a mental concept is represented either as a set of cell assemblies–meaning a bunch of neurons that are interlinked together, often fire together, and have high reinforced weights of the synapses between them–or else, more complexly, a distinct temporal activation pattern associated with a particular set of cell assemblies. The same set of assemblies could have many concepts corresponding to different temporal patterns.

The first one is simpler. I do not know which of the two the brain uses. It could of course use something besides those two, but the logic of our argument works whichever one of those is the case, as we outline in the paper.

probable_3.png

We then look at the idea of a synaptic bundle. We have one assembly of neurons over here, another assembly of neurons over here, and you can have a whole bunch of synaptic links going between neurons here and neurons there. Then, each of these synapses, if you take a formal model of it–you can say there is a weight associated with that synapse–of course that weight boils down to a bunch of complex biology. There is not really a weight there, but it is a formal model. We take these abstractions of weights and if we have a bundle of these things with weights, we can say, “What is the composite of the bundle of synapses going from this group of assemblies to this one?” If there are actually temporal activation patterns, the math still works out, it is just the weight of these synapses on average when this temporal activation pattern is active.

probable_4.png

Then you can define the weight of the bundle implicitly by saying, “You have the mean activation here, you have the mean activation here a little bit later. How much of the activation here filters through to being activation here over time?” That lets you over time implicitly and statistically define the weight of this synaptic bundle. Is it excitatory, is it inhibitory, is it strong, is it weak? You can think about the weights of bundles from cell assemblies to cell assemblies. What is interesting there is that you can use that to emergently get conditional probabilities.

Basically, what is the probability that this cell assembly is active, given that that cell assembly was active? Or the probability that this temporal activation pattern across a cell assembly is observed, given that that temporal activation pattern across another cell assembly was observed? That gives you a conditional probability. If you buy that this guy denotes something and this guy denotes something–for example, responding to some observed stimulus in the visual field–then you can draw a conditional probability there.
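As a rough illustration of the estimate described above, the conditional probability can be read off two recorded activity traces by counting how often the second assembly is active shortly after the first. This is only a sketch and is not taken from the talk or the paper: the function name `conditional_activation_probability`, the one-step `lag`, and the toy traces are all assumptions made for the example.

```python
def conditional_activation_probability(source_active, target_active, lag=1):
    """Estimate P(target active at t + lag | source active at t).

    Both arguments are equal-length sequences of 0/1 flags, one per time step.
    Returns None when the source assembly was never active in the usable window.
    """
    if len(source_active) != len(target_active):
        raise ValueError("traces must have the same length")
    source_count = 0
    joint_count = 0
    # Only time steps that still have a partner at t + lag can be counted.
    for t in range(len(source_active) - lag):
        if source_active[t]:
            source_count += 1
            if target_active[t + lag]:
                joint_count += 1
    if source_count == 0:
        return None
    return joint_count / source_count


# Hypothetical traces: assembly A fires on even steps, B usually follows one step later.
trace_a = [1, 0, 1, 0, 1, 0, 1, 0]
trace_b = [0, 1, 0, 1, 0, 0, 0, 1]
estimate = conditional_activation_probability(trace_a, trace_b)
print(estimate)  # 0.75: B followed A on three of four occasions
```

In this picture the statistically defined weight of the bundle from A to B plays the role of the estimated value.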

probable_5.png

On the formal level, it is interesting to connect that with a simple inference diagram that you draw in uncertain term logic. If you look at a term logic-based inference system like Novamente’s Probabilistic Logic Networks (PLN), which my colleague Matt Ikle presented on yesterday, or Pei Wang’s NARS system, we have systems of inference that look at transitive inference. S implies M, M implies P, so S implies P. You can go around the loop in different ways. In probabilistic term logic, that comes out from combining transitive inference with Bayes’ rule.
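To make the S, M, P triangle concrete, here is a small numeric sketch of the two ingredients named above: Bayes' rule for reversing a link, and a deduction step for chaining two links. The deduction formula assumes that P depends on S only through whether M holds (an independence assumption); the talk does not state which formula is used, so this should be read as one standard derivation rather than as the exact rule in PLN or NARS. The function names and the example numbers are made up.

```python
def invert(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0.0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


def deduce(p_m_given_s, p_p_given_m, p_m, p_p):
    """Estimate P(P|S) from P(M|S), P(P|M), P(M) and P(P).

    Assumes P is independent of S once we know whether M holds, so that
    P(P|S) = P(P|M) P(M|S) + P(P|not M) P(not M|S).
    """
    if not 0.0 <= p_m < 1.0:
        raise ValueError("P(M) must lie in [0, 1)")
    # P(P|not M) follows from the law of total probability applied to P(P).
    p_p_given_not_m = (p_p - p_m * p_p_given_m) / (1.0 - p_m)
    return p_m_given_s * p_p_given_m + (1.0 - p_m_given_s) * p_p_given_not_m


# Hypothetical strengths for the three terms and two known links.
p_s_implies_p = deduce(p_m_given_s=0.8, p_p_given_m=0.9, p_m=0.5, p_p=0.6)
p_m_implies_s = invert(p_b_given_a=0.8, p_a=0.25, p_b=0.5)
print(round(p_s_implies_p, 4))  # 0.78
print(round(p_m_implies_s, 4))  # 0.4
```

Going "around the loop in different ways" then amounts to applying `invert` to one or both premises before calling `deduce`.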

What is interesting is that, on the face of it, you can say maybe that’s an assembly, that’s an assembly, and that’s an assembly. These are synaptic bundles, and does it work out to say that if you have a high weight connection from this assembly to this assembly, and a high weight connection from this assembly to this assembly, does the brain learn through a whole bunch of micro-level instances of Hebbian learning on the neuron-to-neuron level to make a connection there? That was the essential idea we had, and we coded up some simple simulations in Ruby to see if it would work in some special cases. There seems to be some meaning there.
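The question of whether repeated Hebbian updates could build the third connection can be illustrated with a deliberately tiny model. The sketch below is not the simulation mentioned in the talk: it collapses each assembly to a single unit, uses a trace-based Hebbian rule with saturation (an assumption), and does not feed the learned weight back into the dynamics. All names and parameter values are hypothetical.

```python
def hebbian_shortcut(trials=20, rate=0.1, trace_decay=0.5):
    """Return the learned A->C weight after repeatedly stimulating A.

    Activity flows A -> B -> C through two strong, fixed links. A keeps a
    decaying trace of its own recent activity, and the A->C weight grows
    whenever that trace overlaps with activity in C.
    """
    w_ab, w_bc = 1.0, 1.0  # existing strong bundles (assumed values)
    w_ac = 0.0             # shortcut bundle to be learned
    for _ in range(trials):
        trace_a = 0.0
        a, b, c = 1.0, 0.0, 0.0  # one stimulus to A at the start of the trial
        for _step in range(3):
            trace_a = trace_decay * trace_a + a
            # Hebbian update: presynaptic trace times postsynaptic activity,
            # scaled by (1 - w_ac) so the weight saturates below 1.
            w_ac += rate * trace_a * c * (1.0 - w_ac)
            # Propagate activity one step along the chain.
            a, b, c = 0.0, w_ab * a, w_bc * b
    return w_ac


weight_after_20 = hebbian_shortcut(trials=20)
weight_after_40 = hebbian_shortcut(trials=40)
print(round(weight_after_20, 3), round(weight_after_40, 3))
```

The shortcut weight rises monotonically with the number of trials, which is the qualitative behavior the hypothesis needs; it says nothing about how closely a real network would track the probabilistic deduction value.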

We have not done a real simulation study trying to compare it to a real part of the brain carrying out a specific action, which is an important project.

probable_6.png

probable_7.png

The basic idea, if you think in terms of Bayes’ rule, is that activity here implies activity here in terms of a synaptic bundle. Then, will the brain learn that activity here should lead to activity there by the collective action of Hebbian learning in a bunch of neuron-to-neuron connections? You certainly don’t get exact mathematics, but you get a rough correspondence with what Bayes’ rule would say. We are not that good in many contexts at doing exact probabilistic inference either.

probable_8.png

Similarly, if you have three guys, they can be going around in a cycle in various ways. This is where an interesting topic came up, which is where we spent most of our work in thinking about this project. I am not going to talk about it too long because I only have a couple of minutes left. In a probabilistic logic engine like PLN or NARS, you have a difficulty with circular reasoning. If you have A implies B, B implies C, A implies C, you can go around the circle in many different ways.

In a probabilistic logic system, it is often hard to stop from there being a kind of truth value pollution, where small errors in the inference just compound and everything becomes nonsense. You are just inferring in every direction constantly, over and over again. There is a mechanism called inference trails in PLN and NARS that lets you deal with that, where basically each link contains a list of the other links that were used to infer it, and you disallow circularity. If these guys were used to give rise to this, then you do not use this to adjust the truth value of that. Using inference trails lets you have a bunch of little directed acyclic graphs inside a looping network, thus avoiding the need to make a big directed acyclic graph like you have in a Bayes net.
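The trail mechanism just described can be sketched as a small data structure: each link remembers the names of the links that contributed to it, and a revision is refused when it would close a loop. This is a minimal illustration of the idea, not the PLN or NARS implementation; the `Link` class, `derive`, and `may_revise` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    strength: float
    trail: frozenset = frozenset()  # names of all links used to infer this one


def derive(name, strength, premises):
    """Create a link whose trail records its premises and their own trails."""
    trail = frozenset()
    for premise in premises:
        trail |= {premise.name} | premise.trail
    return Link(name, strength, trail)


def may_revise(target, evidence):
    """Allow `evidence` to adjust `target` unless `target` helped produce it."""
    return target.name != evidence.name and target.name not in evidence.trail


a_b = Link("A->B", 0.8)
b_c = Link("B->C", 0.9)
a_c = derive("A->C", 0.78, [a_b, b_c])

print(may_revise(a_b, a_c))  # False: A->B was used to infer A->C
print(may_revise(a_c, a_b))  # True: A->C did not contribute to A->B
```

Because the check is local to each pair of links, the overall network may still contain loops while each individual derivation stays acyclic.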

The brain does not seem to have any of that. The brain is not a directed acyclic graph. It is all tangled up. The brain also has no inference trails. Each synapse does not list in some data structure all the other synapses that led to its synaptic modification. The question is, “What happens?”

What is interesting when you play with simple simulations of this stuff is that attention allocation comes to the rescue. You can set up a system like this, and if the neurons are not getting stimulated except when new data is coming into the system, you let the entire thing relax non-linearly. Most neurons are not doing something all the time, they are only doing something when they are stimulated, so the system does not want to spend its resources going around in circles doing inference constantly. Then there is the fact that there are thresholds on neurons, so everything is not always firing and doing something; there is some limitation. There are also refractory periods in neurons, where after they fire they do not do anything for a while.

The combination of these things means that when we actually played with the system, we did not get this horrible descent into circular inference and contradiction. You do get some error that way, but there are so many other errors in this framework that this is not even the biggest error. You can get a rough correspondence in a network like this with the results of simple probabilistic deduction and Bayes’ rule. Under the regimes that you naturally would operate the system under, you actually would not get the circular inference. You do if you flood the system with input constantly, but that is not what happens when you take a neural network and have it interact with the world. You have data coming in when the system needs it.
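A toy model can show why thresholds and refractory periods keep activity from circulating forever once input stops. The sketch below is not the simulation described in the talk: it puts three units in a ring, injects a single pulse, and counts how many time steps show any firing. The function names and every parameter value are assumptions chosen for illustration.

```python
def simulate_ring(n_units=3, weight=0.8, threshold=0.5, refractory=2,
                  steps=12, pulse=1.0):
    """Propagate one input pulse around a ring of threshold units.

    Each unit passes on its weighted input if that input reaches the
    threshold and the unit is not in its refractory period.
    Returns the activation of every unit at every time step.
    """
    activation = [0.0] * n_units  # outputs from the previous step
    ready_at = [0] * n_units      # earliest step at which each unit may fire
    history = []
    for t in range(steps):
        new_activation = [0.0] * n_units
        for i in range(n_units):
            drive = weight * activation[(i - 1) % n_units]
            if t == 0 and i == 0:
                drive += pulse    # external data arrives only once
            if drive >= threshold and t >= ready_at[i]:
                new_activation[i] = drive
                ready_at[i] = t + 1 + refractory
        activation = new_activation
        history.append(list(activation))
    return history


def active_steps(history):
    """Count the time steps on which at least one unit fired."""
    return sum(1 for row in history if any(row))


print(active_steps(simulate_ring()))                            # 4
print(active_steps(simulate_ring(weight=1.0, refractory=0)))    # 12
print(active_steps(simulate_ring(weight=1.0, refractory=3)))    # 3
```

With lossless weights and no refractory period the pulse circles for the whole run; either a sub-unit weight combined with a threshold or a long enough refractory period makes it die out after about one lap.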

The last thing here, which I will go over very quickly, is another speculation piled onto all these previous speculations. Simple logic is nice, transitive inference and Bayes’ rule, but how about all the rest of logic? What do we do with quantifiers, Skolemization, and x, y, z? That seems a whole added order of complexity. I thought in the back of my mind for about two years about how the brain could possibly ever do stuff like that in some way that is not a totally lame explanation, like saying the brain is the operating system and there is some other program implemented on top of it. Is there any way to look at these higher-order logic things as coming naturally out of what we know the brain to do?

The best idea I have had so far goes back to the notion of higher-order functions and combinatory logic. The notion that the brain maintains variables and lambdas seemed out there to me. But if you look at the insides of an interpreter for a functional programming language like Haskell, you can do stuff with no variables. You have variable-free logic, Turing-complete systems with no variables. What you have are higher-order functions, combinators: functions that take functions as arguments, which in turn take functions as arguments. The math works out that you can do without variables by manipulating things in a tricky way. If you are not familiar with this branch of math, I have no chance of conveying it in the next minute.
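For readers unfamiliar with combinatory logic, a short sketch may help. The three classic combinators S, K and I are defined once below (Python needs lambda parameters to define them), and after that new functions are built purely by applying combinators to each other, with no further variables. This is a generic textbook illustration, not code from the paper; `increment` and `double` are made-up helper functions.

```python
# The basic combinators, written as curried one-argument functions.
I = lambda x: x                               # identity: I x = x
K = lambda x: lambda y: x                     # constant: K x y = x
S = lambda f: lambda g: lambda x: f(x)(g(x))  # S f g x = f x (g x)

# Everything below is built only by application, with no new variables.
identity_from_sk = S(K)(K)  # S K K x = K x (K x) = x
compose = S(K(S))(K)        # the B combinator: B f g x = f (g x)


def increment(n):
    return n + 1


def double(n):
    return n * 2


print(identity_from_sk(42))                # 42
print(compose(increment)(double)(5))       # 11, i.e. increment(double(5))
print(compose(double)(increment)(5))       # 12, i.e. double(increment(5))
```

Function composition, which would normally be written with an explicit argument, falls out of S and K alone; the same construction extends to any computable function.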

probable_9.png

What is interesting is that if you look at how neural networks could work, there is no obstacle to implementing that higher-order system. What you need is basically a neural net that looks at the internal connections as well as the input and output of a neural subnet. It is actually acting on a whole neural net as its input, including the internals of it and the input and output.

If you look at a neural subnet, it kind of serves as a router and basically acts like one or another neural net depending on the control signal. You can say this neural net takes the internal state and input and output of this guy as its input, it gives some output, and based on the output it is processed in some other way, flips a switch inside, which then causes it to emulate another guy internally.

The logic of this is laid out in the paper. It is kind of complex recursive thinking. Ultimately, by having a neural net acting on another neural net and measuring its whole state, you can implement arbitrarily complex higher-order functions using neural nets that measure each other. You can represent anything in combinatory logic. If you get a Turing-complete system, you do not need variables. The combination of that with the simpler probabilistic stuff I discussed earlier in the talk is at the very least a tantalizing notion for how you might get this complex stuff out of what the brain does.

To make an AGI system out of all this, there are a heck of a lot of intermediate steps that I do not know what to do with nearly as well as I do with Novamente. So I am not really pursuing this actively, but I think it is an interesting area for investigation.

ben_goertzel_info.png

