L G Sanchez Giraldo, O Schwartz. Flexible normalization in deep convolutional neural networks. Computational and Systems Neuroscience (Cosyne) abstract, 2017.

Spatial context effects are prevalent in visual neural processing and in perception. In primary visual cortex, neural responses are modulated in rich ways by stimuli spatially surrounding the classical receptive field. These effects have been modelled with divisive normalization approaches. From a scene statistics perspective, spatial normalization may be seen as a nonlinear operation that reduces statistical dependencies between filter activations to scenes. Recent models propose that spatial normalization is not fixed: normalization is recruited only to the degree that filter activations in center and surround locations are deemed statistically dependent. However, extending these studies to understand when normalization is recruited in mid-level visual areas, such as secondary visual cortex, has so far been elusive. This may partly be due to a lack of understanding of what the optimal stimulus space or filter set might be. In this work, we propose to use deep convolutional neural networks (CNNs) to study flexible normalization mechanisms. CNNs have shown intriguing similarities to visual cortex, and they provide a potentially tractable way to obtain representations at mid-levels of a visual processing hierarchy. We introduce a computational model that incorporates flexible normalization into the second layer of the network. This model is able to capture non-trivial spatial dependencies among mid-level features, such as textures and more geometric tilings of the space. This approach makes predictions about when spatial normalization might be recruited in mid-level areas. In addition, the performance of deep CNNs on supervised tasks has been shown to benefit from incorporating neuroscience-inspired nonlinearities such as local contrast normalization.
Our work suggests that flexible normalization has further potential to improve the performance of deep CNNs for visual recognition tasks.
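As a rough illustration of the two operations discussed above, and not the authors' implementation, the sketch below shows standard divisive normalization of filter activations alongside a gated "flexible" variant in which the surround pool contributes only to the degree a gating variable deems center and surround dependent. All function and parameter names here (`divisive_normalization`, `flexible_normalization`, `gate`, `sigma`) are illustrative assumptions.

```python
import numpy as np

def divisive_normalization(acts, weights, sigma=1.0):
    """Standard divisive normalization (a sketch).

    Each filter activation is divided by a weighted pool of squared
    activations across filters at the same spatial location.

    acts:    (n_filters, height, width) filter activations
    weights: (n_filters, n_filters) cross-filter pooling weights
    sigma:   semi-saturation constant
    """
    energy = acts ** 2
    # Pool squared activations across the filter dimension.
    pooled = np.tensordot(weights, energy, axes=(1, 0))
    return acts / np.sqrt(sigma ** 2 + pooled)

def flexible_normalization(center, surround, gate, sigma=1.0):
    """Gated ("flexible") normalization (a sketch).

    The surround energy is recruited into the normalization pool only
    to the degree `gate` (in [0, 1]) deems center and surround
    statistically dependent; gate = 0 leaves only self-normalization.

    center, surround: arrays of matching shape with filter activations
    gate:             scalar or array in [0, 1], dependence weight
    """
    pool = center ** 2 + gate * surround ** 2
    return center / np.sqrt(sigma ** 2 + pool)
```

In a flexible-normalization model along these lines, the gate itself would be inferred from the statistics of center and surround activations (e.g., via a mixture model over dependent and independent components), rather than supplied by hand as in this sketch.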