Implementing Divisive Normalization in CNNs Improves Robustness to Common Image Corruptions
Andrew Cirincione · Reginald Verrier · Artiom Bic · Stephanie Olaiya · James J DiCarlo · Lawrence Udeigwe · Tiago Marques
Event URL: https://openreview.net/forum?id=KAAbo44qhJV

Some convolutional neural networks (CNNs) have achieved state-of-the-art performance in object classification. However, they often fail to generalize to images perturbed with different types of common corruptions, limiting their deployment in real-world scenarios. Recent studies have shown that more closely mimicking biological vision in early areas such as the primary visual cortex (V1) can lead to some improvements in robustness. Here, we extended this approach by introducing, at the V1 stage of a biologically-inspired CNN, a layer based on the neuroscientific model of divisive normalization, which has been widely used to model activity in early vision. The resulting model family, the VOneNetDN, maintained the clean accuracy of the standard base model (relative accuracy of 99%) while greatly improving robustness to common image corruptions (relative gain of 18%). The VOneNetDN also showed better alignment to primate V1 than the model without divisive normalization for some response properties (contrast and surround modulation), but not all. These results serve as further evidence that neuroscience can still contribute to progress in computer vision.
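The divisive normalization described in the abstract follows the canonical neuroscientific form: a unit's rectified drive is divided by a semi-saturation constant plus a signal pooled from neighboring units in space and across channels. The PyTorch sketch below illustrates that general form only; the class name, the learnable sigma, the exponent n, and the uniform pooling kernel are assumptions for illustration and are not taken from the VOneNetDN implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DivisiveNormalization(nn.Module):
    """Illustrative divisive normalization over space and channels."""
    def __init__(self, channels, pool_size=5, n=2.0, sigma=1.0):
        super().__init__()
        self.n = n
        # Semi-saturation constant (learnable), one value per channel.
        self.sigma = nn.Parameter(torch.full((1, channels, 1, 1), sigma))
        # Suppressive pool: mixes rectified activity across all channels
        # and a local spatial neighborhood (uniform weights at init).
        self.pool = nn.Conv2d(channels, channels, kernel_size=pool_size,
                              padding=pool_size // 2, bias=False)
        nn.init.constant_(self.pool.weight, 1.0 / (channels * pool_size ** 2))

    def forward(self, x):
        # Rectified drive of each unit, raised to the exponent n.
        drive = F.relu(x) ** self.n
        # Pooled suppressive signal from neighboring units.
        suppression = self.pool(drive)
        # Canonical form: drive / (sigma^n + pooled drive).
        return drive / (self.sigma.abs() ** self.n + suppression)

# Example: normalizing the output of a V1-like convolutional stage
# (a plain convolution here, standing in for a fixed V1 front end).
v1 = nn.Sequential(nn.Conv2d(3, 64, kernel_size=7, padding=3),
                   DivisiveNormalization(64))
responses = v1(torch.randn(1, 3, 224, 224))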

Author Information

Andrew Cirincione
Reginald Verrier
Artiom Bic (Columbia University)
Stephanie Olaiya (Brown University)
James J DiCarlo (Massachusetts Institute of Technology)

Prof. DiCarlo received his Ph.D. in biomedical engineering and his M.D. from Johns Hopkins in 1998, and did his postdoctoral training in primate visual neurophysiology at Baylor College of Medicine. He joined the MIT faculty in 2002. He is a Sloan Fellow, a Pew Scholar, and a McKnight Scholar. His lab’s research goal is a computational understanding of the brain mechanisms that underlie object recognition. They use large-scale neurophysiology, brain imaging, optogenetic methods, and high-throughput computational simulations to understand how the primate ventral visual stream is able to untangle object identity from other latent image variables such as object position, scale, and pose. They have shown that populations of neurons at the highest cortical visual processing stage (IT) rapidly convey explicit representations of object identity, and that this ability is reshaped by natural visual experience. They have also shown how visual recognition tests can be used to discover new, high-performing bio-inspired algorithms. This understanding may inspire new machine vision systems, new neural prosthetics, and a foundation for understanding how high-level visual representation is altered in conditions such as agnosia, autism and dyslexia.

Lawrence Udeigwe (Manhattan College & MIT)

Dr. Lawrence Udeigwe is an Associate Professor of Mathematics at Manhattan College and a 2021/22 MLK Visiting Associate Professor in Brain and Cognitive Sciences at MIT. His research interests include the use of differential equations to understand the dynamical interaction between Hebbian plasticity and homeostatic plasticity; the use of artificial neural networks (ANNs) to investigate the mechanisms behind surround suppression and other normalization processes in vision; and the practical and philosophical implications of the use of theory in neuroscience. Dr. Udeigwe obtained his PhD from the University of Pittsburgh in 2014 under the supervision of Bard Ermentrout and Paul Munro.

Tiago Marques (MIT)
