Recurrent Convolutional Neural Networks: A Better Model of Biological Object Recognition.
Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and non-human primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models, digit clutter (where multiple target digits occlude one another) and digit debris (where target digits are occluded by digit fragments). We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognizing objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also found to be more robust to the inclusion of additive Gaussian noise. Recurrent neural networks are better in two respects: (1) they are more neurobiologically realistic than their feedforward counterparts; (2) they are better in terms of their ability to recognize objects, especially under challenging conditions. This work shows that computer vision can benefit from using recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.
European Commission (604102)