A review was published this week in Neuron by DeepMind luminary Demis Hassabis and colleagues about Neuroscience-inspired Artificial Intelligence. As one would expect from a journal called Neuron, the article was pretty positive about the use of neurons!
There have been two key concepts from neuroscience that are ubiquitous in the AI field today: Deep Learning and Reinforcement Learning. Both are very direct descendants of research from the neuroscience community. In fact, saying that Deep Learning is an outgrowth of neuroscience understates the amount of influence neuroscience has had. It did not just gift the idea of connecting artificial neurons together to build a fictive brain, but much more technical ideas: convolutional neural networks, which apply a single function repeatedly across their input as the retina or visual cortex does; hierarchical processing, in the way the brain goes from layer to layer; divisive normalization, as a way to keep outputs within a reasonable and useful range. Similarly, Reinforcement Learning and all its variants have continued to be expanded and developed by the cognitive community.
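To make the convolutional point concrete, here is a minimal sketch (plain numpy, my own toy code, not anything from the review) of the weight sharing that CNNs borrow from the visual system: one small kernel is reused at every position of the input, just as every patch of retina applies the same local filter.

```python
import numpy as np

def conv1d(signal, kernel):
    """Slide one shared kernel across the whole input. The same small
    function is applied at every position -- the weight sharing that
    convolutional networks inherit from retina/visual cortex."""
    k = len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel)
                     for i in range(len(signal) - k + 1)])

# A simple edge-detecting kernel applied across a step signal:
signal = np.array([0., 0., 0., 1., 1., 1.])
edges = conv1d(signal, np.array([-1., 1.]))
# edges -> [0., 0., 1., 0., 0.] : nonzero only where the step occurs
```

One kernel of two weights covers the whole signal; a fully connected layer doing the same job would need a separate weight for every (input, output) pair.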
Sounds great! So what about more recent inspirations? Here, Hassabis & co offer up the roles of attention, episodic memory, working memory, and ‘continual learning’. But reading this, I became less inspired than morose (see this thread). Why? Well, look at the example of attention. Attention comes in many forms: automatic, voluntary, bottom-up, top-down, executive, spatial, feature-based, object-based, and more. It sometimes means a sharpening of the collection of things a neuron responds to, so that instead of being active in response to an edge oriented this way, that way, or another way, the neuron only fires when it sees an edge at one particular orientation. But it sometimes means a narrowing of the area in space that a neuron responds to. Sometimes responses between neurons become more diverse (decorrelated).
But this is not really how ‘attention’ works in deep networks. All of these examples seem primarily motivated by the underlying psychology, not the biological implementation. Which is fine! But does that mean that the biology has nothing to teach us? Even at best, I am not expecting Deep Networks to converge precisely to mammalian-based neural networks, nor that everything the brain does should be useful to AI.
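For contrast, here is a rough numpy sketch of what ‘attention’ typically means in a deep network: a differentiable, softmax-weighted lookup in the spirit of soft (dot-product) attention. The function names are mine, for illustration. Note what is absent — no tuning-curve sharpening, no receptive-field narrowing, no decorrelation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def soft_attention(query, keys, values):
    """Deep-network 'attention': score each stored item against a query,
    turn the scores into softmax weights, and return a weighted average
    of the values. A soft, differentiable lookup -- motivated by the
    psychology of selection, not the biological implementation."""
    scores = keys @ query / np.sqrt(len(query))  # scaled dot-product scores
    weights = softmax(scores)
    return weights @ values, weights

# The query matches the first key, so most weight lands on the first value:
keys = np.array([[1., 0.], [0., 1.]])
values = np.array([10., 20.])
out, weights = soft_attention(np.array([1., 0.]), keys, values)
```

Everything here is one smooth weighted sum chosen so gradients flow through it, which is exactly why it looks so unlike the mechanisms attention researchers in neuroscience describe.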
This leads to some normative questions: why hasn’t neuroscience contributed more, especially to Deep Learning? And should we even expect it to?
It could just be that the flow of information from neuroscience to AI is too weak. It’s not exactly like there’s a great list of “here are all the equations that describe how we think the brain works”. If you wanted to use a more nitty-gritty implementation of attention, where would you turn? Scholarpedia? What if someone wants to move step-by-step through all the ways that visual attention contributes to visual processing? How would they do it? Answer: they would become a neuroscientist. Which doesn’t really help, time-wise. But maybe, slowly over time, these two fields will be more integrated.
More to the point, why even try? AI and neuroscience are two very different fields: one is an engineering discipline asking “how do we get this to work?” and the other a scientific discipline asking “why does this work?”. Who is to say that anything we learn from neuroscience would even be relevant to AI? Animals are bags of meat with a nervous system trying to solve all sorts of problems (like wiring-length energy costs between neurons, physical transmission delays, the need to regulate blood osmolality, etc.) that AI has no real interest in or need for, but that may be fundamental to how the nervous system has evolved. Is the brain the bird to AI’s airplane, accomplishing the same job but engineered in a totally different way?
Then in the middle of writing this, a tweet came through my feed that made me think I had a lot of this wrong (I also realized I had become too fixated on ‘the present’ section of their paper and less on ‘the past’ which is only a few years old anyway).
The ‘best paper’ award at the CVPR 2017 conference went to this paper, which connects blocks of layers together, passing information forward from each layer to all the ones that follow.
That looks a lot more like what cortex looks like! Though obviously sensory systems in biology are a bit more complicated:
And the advantages? “DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.”
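A toy sketch of that dense-connectivity wiring may help. This is my own illustration with random, untrained weights on flat vectors — it shows the feature reuse (every layer sees the concatenation of all earlier outputs), not DenseNet’s actual convolutional layers or training.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_block(x, num_layers=3, growth=4):
    """DenseNet-style block: each new layer receives the concatenation
    of the input and ALL previous layers' outputs, so early features are
    passed forward and reused rather than recomputed. Weights are random
    here -- this demonstrates the wiring only."""
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features)            # everything computed so far
        W = 0.1 * rng.standard_normal((growth, len(inp)))
        features.append(np.maximum(W @ inp, 0.))  # small ReLU layer adding
                                                  # `growth` new features
    return np.concatenate(features)               # input + all new features

out = dense_block(np.ones(8))
# output length: 8 inputs + 3 layers x 4 new features = 20
```

Because each layer only has to add a few new feature maps on top of what it can reuse, the parameter count stays small — which is the “feature reuse” and “substantially reduce the number of parameters” claim in the quote above.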
So are the other features of cortex useful in some way? How? How do we have to implement them to make them useful? What are the drawbacks?
Neuroscience is big and unwieldy, spanning a huge number of different fields. But most of these fields are trying to solve exactly the same problem that Deep Learning is trying to solve in very similar ways. This is an incredibly exciting opportunity – a lot of Deep Learning is essentially applied theoretical neuroscience. Which of our hypotheses about why we have attention are true? Which are useless?