The “whole lobster” model

It’s easy to genuinely believe that feeding data into increasingly complex deep learning models is akin to AI. I’ve met more than one techie who believes that AI is anywhere near cognition, or will be in the foreseeable future. This is very far from the truth.

Neural nets in “artificial intelligence”, as they stand today, are nothing of the sort.

AI will stay far away from the kind of generalized intelligence we expect from animals as long as sensory input, neuronal spiking and other biology-like systems are not modelled. I’d be happy with a synthetic C. elegans, really. I mean, come on, it’s a three-hundred-neuron worm. Who knows, maybe it’s getting there.

Synapses, Q-learning, Bayes and morphological variation

I’ve been thinking about this for quite some time. I recently learned about Q-learning, which is basically fancy (if brute-force) reinforcement learning by rewards. Some people have been integrating Q-learning with neural networks (of the deep learning, synthetic kind), but I’m afraid this is not really what I was describing above. Don’t misunderstand me: it’s still very interesting, but I don’t think it’s fundamentally different from what’s been done so far.

I think the usual approach to Q-learning is fundamentally wrong. As it stands, Q-learning is mapped to a specific behaviour. Take, for example, a tic-tac-toe game with simple, delineated rules: the rewards of the Q-learning paradigm will be directed at optimizing the actions that lead to a win.
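To make that concrete, here’s a minimal sketch of tabular Q-learning in Python (the names and constants are mine, purely for illustration, not a full tic-tac-toe engine):

```python
import random
from collections import defaultdict

# Q maps (state, action) pairs to an estimated long-term reward.
Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def choose_action(state, actions):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    # The classic Q-learning update: nudge Q(s, a) toward the observed
    # reward plus the discounted value of the best next action.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

Every behaviour here is baked into the reward: the table only ever learns “what wins this particular game”.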

Now, there are some problems with that. First, the number of possible states in a game can become humongous very quickly, even in basic environments. If I understood it well, even on a simple n × n tabular grid, Q-learning is trying to explore the possible states of way too many options. Yes, it’ll soon optimize a path or a way to win whatever game you give it, but it’s a bit too brute-forcey, and it only makes sense in simulated environments where the set of factors to take into account is very reduced compared to, you know, the flesh world.
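A quick back-of-the-envelope count shows how fast this blows up (these are crude upper bounds, not reachable-state counts):

```python
# Tic-tac-toe: each of 9 cells is empty, X, or O (many boards unreachable).
print(3 ** 9)            # 19683 -- tiny, a Q-table is fine here

# A 10x10 grid where each cell can be in just 4 conditions:
print(4 ** (10 * 10))    # ~1.6e60 -- no table will ever hold this
```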

So, of course, people are combining it with neural networks, so that the Q-values are approximated by a network instead of stored in a table. Note, however, that behaviours still arise from the Q-learning objective itself, rather than from the interaction of the system’s own states.
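For reference, here’s roughly what that combination looks like, sketched in PyTorch. The network stands in for the table, but the training signal is the very same Q-learning target as before:

```python
import torch
import torch.nn as nn

# Sketch of the deep Q-network idea: a neural net replaces the Q-table.
# It maps a state vector to one Q-value per action.
class QNet(nn.Module):
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        return self.layers(state)

qnet = QNet(state_dim=9, n_actions=9)  # e.g. a flattened tic-tac-toe board
optimizer = torch.optim.Adam(qnet.parameters(), lr=1e-3)

def td_step(state, action, reward, next_state, gamma=0.9):
    # Same update as the tabular case, expressed as a regression loss:
    # pull Q(s, a) toward r + gamma * max_a' Q(s', a').
    with torch.no_grad():
        target = reward + gamma * qnet(next_state).max()
    loss = (qnet(state)[action] - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```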

Let me explain this in more detail: consider how a simple brain works. A brain is composed of neurons, synapses and neurotransmitters. Voltage changes, or neuronal spiking, cause the exchange of neurotransmitters and the inhibition or potentiation of a given neuron. Neurons that spike together also cluster together functionally, and depending on the neurotransmitter receptors, some connections are reinforced.
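The closest textbook analogue to “spike together, cluster together” is Hebbian learning. Here’s a toy numpy sketch; real spiking dynamics, neurotransmitters and inhibition are all abstracted away:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
weights = rng.normal(0, 0.1, size=(n, n))
eta = 0.01  # plasticity rate

def hebbian_step(rates):
    # "Fire together, wire together": strengthen w[i, j] in proportion
    # to the joint activity of neuron i and neuron j. 'rates' stands in
    # for recent firing activity.
    global weights
    weights += eta * np.outer(rates, rates)
    np.fill_diagonal(weights, 0.0)         # no self-connections
    weights = np.clip(weights, -1.0, 1.0)  # crude saturation in place of decay
```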

In in silico computation, all possible modifications of a model mimicking this structure are in the number, nature or properties of the neurons. These neurons can then adapt their behaviour toward a certain desired outcome. One such adaptation is the kind of error minimization sought by a regular regression, for example, but the basic definition can cover more sophisticated models of so-called “artificial intelligence”, such as classification algorithms. That is, one can adapt the system to conform to a specific problem.
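Here’s that error minimization in its simplest form: a regression fit by gradient descent (a toy sketch with made-up data):

```python
import numpy as np

# Fit a line by nudging parameters downhill on the squared error.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, size=100)  # hidden target: w=3, b=0.5

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = w * x + b - y
    w -= lr * (2 * err * x).mean()  # gradient of mean squared error w.r.t. w
    b -= lr * (2 * err).mean()      # ... and w.r.t. b

print(round(w, 2), round(b, 2))  # should land near 3.0 and 0.5
```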

A far more interesting possibility is to bring this kind of adaptability (should I say plasticity?) down to the individual units of computation. That’s something similar to the artificial neural network paradigm we are seeing everywhere, where the functions of the neurons themselves are optimized (so the problem is reduced to more minute features, and the training can capture more and more nuance in the data). Most generative AI algorithms work like that, if I’m not mistaken. Impressive, but still domain-specific.

How about the following: you establish a self-assembling dynamic system. You give it an innate (determined) necessity, and then you model not a series of neurons that solve the problem, but a series of neurons that adapt themselves and make connections arising from the very properties of the system. That’s, of course, far harder; yet, again, a three-hundred-neuron worm brain can manage a decent range of behaviours. Maybe nothing like playing Go, of course, but it’s closer to the generalization of a regular brain than to the domain specificity of our computers.
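I don’t have a real algorithm for this, but here’s a toy caricature of what I mean (my own speculative sketch, not an established method): units start unconnected, connections grow between units whose activity happens to correlate, and connections that never take hold are pruned. Whatever structure appears comes from the dynamics, not from a task-specific loss.

```python
import numpy as np

rng = np.random.default_rng(2)
n, steps = 30, 200
w = np.zeros((n, n))            # start with no wiring at all
state = rng.normal(size=n)

for _ in range(steps):
    drive = rng.normal(size=n) * 0.5        # stand-in for sensory input
    state = np.tanh(w @ state + drive)      # network dynamics
    w += 0.01 * np.outer(state, state)      # grow correlated connections
    w[np.abs(w) < 0.005] = 0.0              # prune connections that never take
    np.fill_diagonal(w, 0.0)

print("surviving connections:", int(np.count_nonzero(w)), "of", n * (n - 1))
```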

Should I also mention that, in terms of energy, biology is ridiculously efficient? Since you don’t need a fully connected network of neurons, only the reinforced connections, you can manage with much less energy.
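Crude arithmetic makes the point (the per-neuron connection count is a made-up, vaguely brain-like number):

```python
# Cost of a fully connected layer vs. a sparse one, counting
# multiply-accumulates as a rough proxy for energy.
n = 10_000                 # neurons
dense_ops = n * n          # every neuron talks to every neuron
sparse_ops = n * 50        # ~50 reinforced connections each (illustrative)
print(dense_ops // sparse_ops)  # 200x fewer operations
```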

Theoretically (and, I know, easier said than done), here is what this means in terms of engineering:

1. A more or less stable hardware setting (you can’t work across generations the way evolution does, so hardware modifications aren’t an option unless you implement them yourself). A model of the world can be tempting, but it’s maybe not the best idea: it will never be as information-rich as the actual world, so it won’t generalize easily.

2. A “brain” consisting of a way to act on the world, a way to receive information from the world, and a computational device connecting the two, yielding an action mechanism and a perception mechanism. This is where behaviour modification can happen and improve, evolution-like.

3. On top of that, a future-facing predictive model that self-actualizes, pruning and reinforcing the paths that have led to a certain goal.
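And here’s a skeleton of those three pieces in code. All the names (world.sense, world.apply and so on) are hypothetical; this is an architecture doodle, not a working system.

```python
class Brain:
    def __init__(self, world):
        self.world = world   # [1] the hardware/world the brain is embedded in
        self.paths = {}      # candidate action paths and their reinforcement

    def perceive(self):
        # [2] a way to receive information from the world (hypothetical API)
        return self.world.sense()

    def act(self, action):
        # [2] a way to act on the world (hypothetical API)
        return self.world.apply(action)

    def predict(self, percept):
        # [3] future-facing model: score known paths by past success
        # (percept unused in this stub)
        return max(self.paths, key=self.paths.get, default=None)

    def reinforce(self, path, goal_reached):
        # [3] self-actualize: strengthen paths that led to the goal,
        # prune the ones that keep failing
        self.paths[path] = self.paths.get(path, 0.0) + (1.0 if goal_reached else -0.5)
        if self.paths[path] < -2.0:
            del self.paths[path]
```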

Anyway, that’s my rambling about how biology should inform computation. Way easier said than done - but let me dream.