The brain represents the world in particular ways. Here are a few:
1. The visual world on the retina
The retina is thought to whiten images: it removes correlations between nearby points and normalizes the signal so that it always has roughly the same average, maximum and minimum (so that you can see in very bright and very dark environments). This was originally shown very nicely in two papers from Atick and Redlich (1990, 1992). Essentially, you want to smooth the visual scene around each point by an amount that depends on the noise. You get receptive fields that look something like this:
Or more generally this:
A denoising autoencoder (a network trained to reconstruct the clean image from a corrupted copy, which ends up smoothing locally) has neural representations that look similar.
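To make the whitening idea in this section concrete, here is a minimal sketch using ZCA whitening in NumPy. This is a stand-in, not the frequency-domain derivation in Atick and Redlich: the toy 16-pixel "images", the helper name `zca_whiten` and the `eps` value are all assumptions made for illustration. The `eps` term only loosely plays the role of noise, since it stops the transform from amplifying directions with almost no signal.

```python
import numpy as np

def zca_whiten(samples, eps=1e-5):
    """Whiten the rows of `samples` (n_samples, n_pixels) with ZCA.

    Hypothetical helper for illustration only.
    """
    centered = samples - samples.mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rescale every principal direction to unit variance, then rotate back
    # to pixel space so each output still corresponds to one pixel.
    whitener = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ whitener, whitener

rng = np.random.default_rng(0)

# Toy "images": white noise smoothed with a 5-pixel circular moving average,
# which gives neighbouring pixels strong correlations.
noise = rng.standard_normal((5000, 16))
smooth = sum(np.roll(noise, shift, axis=1) for shift in range(-2, 3)) / 5.0

whitened, whitener = zca_whiten(smooth)

cov_raw = np.cov(smooth, rowvar=False)      # correlated neighbours
cov_white = np.cov(whitened, rowvar=False)  # close to the identity matrix
```

Each row of `whitener` is the local filter applied around one pixel, so plotting a row shows what kind of receptive field this particular whitening implies for the toy data.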
2. The visual world in first order visual cortex
Similarly, if you want to efficiently represent the visual world (once it is denoised) you want to represent things sparsely or independently. This was shown by Olshausen and Field 1996 and Bell and Sejnowski 1997 and is equivalent to doing ICA on natural images. Note that doing PCA on natural images will give you Fourier components.
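The remark about PCA and Fourier components follows from translation invariance: if the statistics are the same at every position, the covariance depends only on the distance between pixels, and the eigenvectors of such a matrix are sinusoids. Below is a small sketch of that on synthetic 1D signals rather than real natural images. The signal length, the Gaussian smoothing kernel and all variable names are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_samples = 32, 20000

# Stationary toy signals: white noise circularly convolved with a Gaussian
# kernel, so the statistics are identical at every position.
offsets = np.arange(n_pixels)
wrapped = np.minimum(offsets, n_pixels - offsets)
kernel = np.exp(-0.5 * (wrapped / 2.0) ** 2)
noise = rng.standard_normal((n_samples, n_pixels))
signals = np.fft.ifft(np.fft.fft(noise, axis=1) * np.fft.fft(kernel), axis=1).real

# PCA: eigenvectors of the sample covariance, largest variance first.
centered = signals - signals.mean(axis=0)
cov = centered.T @ centered / n_samples
eigvals, eigvecs = np.linalg.eigh(cov)
top = eigvecs[:, ::-1][:, :5]

# For each leading component, the fraction of its power that sits in a
# single spatial frequency. Values near 1 mean "this is a sinusoid".
spectrum = np.abs(np.fft.rfft(top, axis=0)) ** 2
concentration = spectrum.max(axis=0) / spectrum.sum(axis=0)
dominant_frequency = spectrum.argmax(axis=0)
```

Here `concentration` measures how sinusoidal each component is and `dominant_frequency` says which frequency it picked. The localized, oriented filters of Olshausen and Field need the sparsity or independence constraint on top of this; plain PCA cannot produce them.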
If you train a Deep Network on ImageNet (AlexNet example below), the filters on the first layer look similar:
3. The auditory world
The auditory world also appears to be encoded efficiently. Lewicki 2002 showed that if you run ICA on acoustic data you get filters that look nearly identical to the sounds neurons respond to (wavelet basis functions).
I have not seen a visualization of the first few layers of a neural network that classifies speech (for instance) but I would guarantee it has features that look like wavelets.
4. Spatial cells
Your sense of place in the world is encoded by a series of grid cells, which are a periodic representation of place, and place cells, which fire at precise locations in space. Dordek et al 2016 showed that non-negative PCA on place cells will give you grid cells. This is similar to the result that PCA on images gives you Fourier components. Note that Dordek et al also use a single-layer feedforward neural network and show that it has a similar property.
It turns out that if you train a Deep recurrent network on a navigation task, you get grid cells (once you have assumed place cells).
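As a rough illustration of what "non-negative PCA on place cells" means, here is a toy 1D sketch. It is not the setup from Dordek et al: the circular track, the difference-of-Gaussians tuning curves, the tuning widths and the projected power iteration used to approximate the non-negative component are all assumptions made to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
n_positions, n_cells = 200, 200

# Hypothetical place cells on a circular track. Each cell has a zero-mean,
# centre-surround (difference-of-Gaussians) tuning curve around its centre.
positions = np.arange(n_positions)
centres = np.linspace(0, n_positions, n_cells, endpoint=False)
dist = np.abs(positions[:, None] - centres[None, :])
dist = np.minimum(dist, n_positions - dist)      # wrap-around distance

def gaussian(d, sigma):
    return np.exp(-0.5 * (d / sigma) ** 2) / sigma

rates = gaussian(dist, 5.0) - gaussian(dist, 10.0)   # (positions, cells)
rates -= rates.mean(axis=0)
cov = rates.T @ rates / n_positions                  # cell-by-cell covariance

def nonnegative_pca(cov, n_iter=500):
    """Approximate the leading non-negative component of `cov`
    with projected power iteration (a heuristic, assumed for this sketch)."""
    v = rng.random(cov.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = np.maximum(cov @ v, 0.0)   # power step, then clip negative weights
        norm = np.linalg.norm(w)
        if norm == 0.0:                # degenerate case: keep the last iterate
            break
        v = w / norm
    return v

weights = nonnegative_pca(cov)         # non-negative weights onto place cells
grid_like = rates @ weights            # the output unit's response along the track
variance = weights @ cov @ weights
```

Plotting `grid_like` against `positions` is the way to check whether the output unit's response is periodic for a given choice of tuning widths. The hexagonal patterns in the paper need the 2D version of this, which the sketch does not attempt.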
What else is left? Olfaction is a mess and doesn’t have a coherent coding principle as far as I can tell (the olfactory space is not clearly defined). Mechanosensation (touch) has been hard to define but Zhao et al 2017 can find first-order touch receptive fields with an autoencoder (like with vision). You can get CPGs (oscillatory movement generators) with recurrent neural networks by training an input signal to be associated with a particular sequence of movements. I’m struggling to think of other internal representations that are well understood.
A long-standing principle in neuroscience has been that successive layers of the brain are attempting to decorrelate their responses to produce ever-finer features. Tishby and Zaslavsky 2015 suggest that a similar principle applies to Deep Networks: you have a constrained input and output, and networks are trying to find the representations that encode the most information between input and output given the limited bandwidth that they have (numbers of layers, numbers of units). It should not be surprising that this entails something like different forms of PCA or ICA or some other signal-detection framework.
One of the nice things about Deep Networks is that you do not have to explicitly code for this in order to find these features: they are costless in a way. You can train for a particular task (a visually-driven one, a path-driven one, an acoustic-driven one) and these features will just fall out. Not only will these features fall out, but neurons which are deeper in the pathway will also have similar activity. This is a much harder problem, and one to which “run PCA again” or “run ICA again” will not give a good answer.
What other neural representations have we not yet seen in neural networks?
Can we cross out foveated vision?
Shameless Plug: http://bair.berkeley.edu/blog/2017/11/09/learn-to-attend-fovea/
Oh that’s cool, I hadn’t seen that before. Yes, I’d count that! The spatial representation emerges out of the task.