Rapid simultaneous advances in machine vision and cognitive neuroimaging present an unparalleled opportunity to assess the current state of artificial models of the human visual system. Here, we perform a large-scale benchmarking analysis of 72 modern deep neural network models to characterize, with robust statistical power, how differences in architecture and training task contribute to the prediction of human fMRI activity across 16 distinct regions of the human visual system. We find: first, that even stark architectural differences (e.g. the absence of convolution in transformers and MLP-mixers) have very little consequence on emergent fits to brain data; second, that differences in task have clear effects, with categorization and self-supervised models showing relatively stronger brain predictivity across the board; third, that feature reweighting leads to substantial improvements in brain predictivity without overfitting, yielding model-to-brain regression weights that generalize at the same level of predictivity to brain responses over thousands of new images. Broadly, this work presents a lay of the land for the emergent correspondences between the feature spaces of modern deep neural network models and the representational structure inherent to the human visual system.
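The feature-reweighting analysis described above can be sketched as a voxelwise ridge regression from model features to fMRI responses, fit on one set of images and scored on held-out images. The following is a minimal illustration using synthetic stand-ins for the real model features and brain data; all names, shapes, and the noise level are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical sketch of feature reweighting: ridge regression from DNN
# features to voxel responses, with generalization tested on held-out images.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_features, n_voxels = 200, 64, 10  # illustrative sizes only

# Simulated DNN features per image, and simulated fMRI responses that are a
# noisy linear readout of those features (stand-in for real data).
X = rng.standard_normal((n_images, n_features))
true_w = rng.standard_normal((n_features, n_voxels))
Y = X @ true_w + 0.1 * rng.standard_normal((n_images, n_voxels))

# Learn reweighting on a training split of images.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.5, random_state=0)
model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_tr, Y_tr)
Y_hat = model.predict(X_te)

# Score each voxel by the Pearson correlation between predicted and observed
# responses on the held-out images (the generalization test).
r = np.array([np.corrcoef(Y_hat[:, v], Y_te[:, v])[0, 1]
              for v in range(n_voxels)])
print(f"mean held-out voxelwise r = {r.mean():.2f}")
```

In this toy setting the held-out correlations are high because the simulated responses really are a linear function of the features; with real fMRI data the same procedure yields the cross-validated predictivity scores discussed in the abstract.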
Author Information
Colin Conwell (Harvard University)
Jacob Prince (Harvard University)
George Alvarez (Harvard University)
Talia Konkle (Harvard University)