Exploring the perceptual straightness of adversarially robust and biologically-inspired visual representations
Anne Harrington · Vasha DuTell · Ayush Tewari · Mark Hamilton · Simon Stent · Ruth Rosenholtz · Bill Freeman
Event URL: https://openreview.net/forum?id=A8ucsSFEAqS
Humans have been shown to use a "straightened" encoding to represent the natural visual world as it evolves in time (Hénaff et al. 2019). In the context of discrete video sequences, "straightened" means that changes between frames follow a more linear path in representation space at progressively deeper levels of processing. While deep convolutional networks are often proposed as models of human visual processing, many do not straighten natural videos. In this paper, we explore the relationship between robustness, biologically-inspired filtering mechanisms, and representational straightness in neural networks in response to time-varying input, and identify curvature as a useful way of evaluating neural network representations. We find that (1) adversarial training leads to straighter representations in both CNN and transformer-based architectures and (2) biologically-inspired elements increase straightness in the early stages of a network, but do not guarantee increased straightness in downstream layers of CNNs. Our results suggest that constraints like adversarial robustness bring computer vision models closer to human vision, but when incorporating biological mechanisms such as V1 filtering, additional modifications are needed to more fully align human and machine representations.
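The curvature measure the abstract refers to can be sketched in a few lines: for a sequence of frame representations, curvature is the average angle between successive difference vectors, so a perfectly straight trajectory scores 0°. The sketch below is a minimal NumPy illustration of that definition; the function name `mean_curvature` is ours, not from the paper, and the actual experiments operate on network activations rather than toy vectors.

```python
import numpy as np

def mean_curvature(reps):
    """Mean curvature (in degrees) of a trajectory of representation
    vectors: the average angle between successive difference vectors.

    reps: array of shape (T, D), one D-dimensional representation per frame.
    """
    reps = np.asarray(reps, dtype=float)
    diffs = np.diff(reps, axis=0)                      # v_t = x_{t+1} - x_t
    diffs /= np.linalg.norm(diffs, axis=1, keepdims=True)
    # cosine of the angle between consecutive normalized difference vectors
    cosines = np.clip(np.sum(diffs[:-1] * diffs[1:], axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cosines)).mean())

# A trajectory moving along a fixed direction is perfectly straight:
straight = np.outer(np.arange(5), np.ones(3))
print(mean_curvature(straight))   # -> 0.0

# A right-angle turn in 2D has curvature 90 degrees:
bent = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
print(mean_curvature(bent))       # -> 90.0
```

Under this measure, "straightening" means the mean curvature of a video's trajectory decreases as representations are taken from deeper stages of processing.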
Author Information
Anne Harrington (Massachusetts Institute of Technology)
Vasha DuTell (Massachusetts Institute of Technology)
Ayush Tewari (Massachusetts Institute of Technology)
Mark Hamilton (Microsoft and MIT)
Simon Stent (Toyota Research Institute)
Ruth Rosenholtz (Massachusetts Institute of Technology)
Bill Freeman (Massachusetts Institute of Technology)
More from the Same Authors
- 2022: CW-ERM: Improving Autonomous Driving Planning with Closed-loop Weighted Empirical Risk Minimization
  Eesha Kumar · Yiming Zhang · Stefano Pini · Simon Stent · Ana Sofia Rufino Ferreira · Sergey Zagoruyko · Christian Perone
- 2021: Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks
  Anne Harrington · Arturo Deza
- 2020 Demonstration: MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval
  Mark Hamilton · Stephanie Fu · Mindren Lu · Johnny Bui · Margaret Wang · Felix Tran · Marina Rogers · Darius Bopp · Christopher Hoder · Lei Zhang · Bill Freeman