Timezone: »
To unveil how the brain learns, ongoing work seeks biologically-plausible approximations of gradient descent algorithms for training recurrent neural networks (RNNs). Yet, beyond task accuracy, it is unclear if such learning rules converge to solutions that exhibit different levels of generalization than their non-biologically-plausible counterparts. Leveraging results from deep learning theory based on loss landscape curvature, we ask: how do biologically-plausible gradient approximations affect generalization? We first demonstrate that state-of-the-art biologically-plausible learning rules for training RNNs exhibit worse and more variable generalization performance compared to their machine learning counterparts that follow the true gradient more closely. Next, we verify that such generalization performance is correlated significantly with loss landscape curvature, and we show that biologically-plausible learning rules tend to approach high-curvature regions in synaptic weight space. Using tools from dynamical systems, we derive theoretical arguments and present a theorem explaining this phenomenon. This predicts our numerical results, and explains why biologically-plausible rules lead to worse and more variable generalization properties. Finally, we suggest potential remedies that could be used by the brain to mitigate this effect. To our knowledge, our analysis is the first to identify the reason for this generalization gap between artificial and biologically-plausible learning rules, which can help guide future investigations into how the brain learns solutions that generalize.
Author Information
Yuhan Helena Liu (University of Washington)
Arna Ghosh (McGill University/ Mila/ Meta)
Blake Richards (Mila/McGill)
Eric Shea-Brown (University of Washington)
Guillaume Lajoie (Mila, Université de Montréal)
More from the Same Authors
-
2021 Spotlight: The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning »
Shahab Bakhtiari · Patrick Mineault · Timothy Lillicrap · Christopher Pack · Blake Richards -
2021 Spotlight: Your head is there to move you around: Goal-driven models of the primate dorsal pathway »
Patrick Mineault · Shahab Bakhtiari · Blake Richards · Christopher Pack -
2021 : Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning »
Nan Rosemary Ke · Aniket Didolkar · Sarthak Mittal · Anirudh Goyal · Guillaume Lajoie · Stefan Bauer · Danilo Jimenez Rezende · Yoshua Bengio · Chris Pal · Michael Mozer -
2022 : Deceiving the CKA Similarity Measure in Deep Learning »
MohammadReza Davari · Stefan Horoi · Amine Natik · Guillaume Lajoie · Guy Wolf · Eugene Belilovsky -
2022 Spotlight: Lightning Talks 2A-4 »
Sarthak Mittal · Richard Grumitt · Zuoyu Yan · Lihao Wang · Dongsheng Wang · Alexander Korotin · Jiangxin Sun · Ankit Gupta · Vage Egiazarian · Tengfei Ma · Yi Zhou · Yishi Xu · Albert Gu · Biwei Dai · Chunyu Wang · Yoshua Bengio · Uros Seljak · Miaoge Li · Guillaume Lajoie · Yiqun Wang · Liangcai Gao · Lingxiao Li · Jonathan Berant · Huang Hu · Xiaoqing Zheng · Zhibin Duan · Hanjiang Lai · Evgeny Burnaev · Zhi Tang · Zhi Jin · Xuanjing Huang · Chaojie Wang · Yusu Wang · Jian-Fang Hu · Bo Chen · Chao Chen · Hao Zhou · Mingyuan Zhou -
2022 Spotlight: Is a Modular Architecture Enough? »
Sarthak Mittal · Yoshua Bengio · Guillaume Lajoie -
2022 Poster: Biologically-plausible backpropagation through arbitrary timespans via local neuromodulators »
Yuhan Helena Liu · Stephen Smith · Stefan Mihalas · Eric Shea-Brown · Uygar Sümbül -
2022 Poster: $\alpha$-ReQ : Assessing Representation Quality in Self-Supervised Learning by measuring eigenspectrum decay »
Kumar K Agrawal · Arnab Kumar Mondal · Arna Ghosh · Blake Richards -
2022 Poster: Learning dynamics of deep linear networks with multiple pathways »
Jianghong Shi · Eric Shea-Brown · Michael Buice -
2022 Poster: Is a Modular Architecture Enough? »
Sarthak Mittal · Yoshua Bengio · Guillaume Lajoie -
2021 Poster: Adversarial Feature Desensitization »
Pouya Bashivan · Reza Bayat · Adam Ibrahim · Kartik Ahuja · Mojtaba Faramarzi · Touraj Laleh · Blake Richards · Irina Rish -
2021 Poster: The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning »
Shahab Bakhtiari · Patrick Mineault · Timothy Lillicrap · Christopher Pack · Blake Richards -
2021 Poster: Gradient Starvation: A Learning Proclivity in Neural Networks »
Mohammad Pezeshki · Oumar Kaba · Yoshua Bengio · Aaron Courville · Doina Precup · Guillaume Lajoie -
2021 Poster: Your head is there to move you around: Goal-driven models of the primate dorsal pathway »
Patrick Mineault · Shahab Bakhtiari · Blake Richards · Christopher Pack -
2020 : Closing remarks »
Raymond Chua · Feryal Behbahani · Julie J Lee · Rui Ponte Costa · Doina Precup · Blake Richards · Ida Momennejad -
2020 Workshop: Biological and Artificial Reinforcement Learning »
Raymond Chua · Feryal Behbahani · Julie J Lee · Sara Zannone · Rui Ponte Costa · Blake Richards · Ida Momennejad · Doina Precup -
2020 : Organizers Opening Remarks »
Raymond Chua · Feryal Behbahani · Julie J Lee · Ida Momennejad · Rui Ponte Costa · Blake Richards · Doina Precup -
2020 Poster: Untangling tradeoffs between recurrence and self-attention in artificial neural networks »
Giancarlo Kerg · Bhargav Kanuparthi · Anirudh Goyal · Kyle Goyette · Yoshua Bengio · Guillaume Lajoie -
2019 : Poster Session »
Pravish Sainath · Mohamed Akrout · Charles Delahunt · Nathan Kutz · Guangyu Robert Yang · Joseph Marino · L F Abbott · Nicolas Vecoven · Damien Ernst · andrew warrington · Michael Kagan · Kyunghyun Cho · Kameron Harris · Leopold Grinberg · John J. Hopfield · Dmitry Krotov · Taliah Muhammad · Erick Cobos · Edgar Walker · Jacob Reimer · Andreas Tolias · Alexander Ecker · Janaki Sheth · Yu Zhang · Maciej Wołczyk · Jacek Tabor · Szymon Maszke · Roman Pogodin · Dane Corneil · Wulfram Gerstner · Baihan Lin · Guillermo Cecchi · Jenna M Reinen · Irina Rish · Guillaume Bellec · Darjan Salaj · Anand Subramoney · Wolfgang Maass · Yueqi Wang · Ari Pakman · Jin Hyung Lee · Liam Paninski · Bryan Tripp · Colin Graber · Alex Schwing · Luke Prince · Gabriel Ocker · Michael Buice · Benjamin Lansdell · Konrad Kording · Jack Lindsey · Terrence Sejnowski · Matthew Farrell · Eric Shea-Brown · Nicolas Farrugia · Victor Nepveu · Jiwoong Im · Kristin Branson · Brian Hu · Ramakrishnan Iyer · Stefan Mihalas · Sneha Aenugu · Hananel Hazan · Sihui Dai · Tan Nguyen · Doris Tsao · Richard Baraniuk · Anima Anandkumar · Hidenori Tanaka · Aran Nayebi · Stephen Baccus · Surya Ganguli · Dean Pospisil · Eilif Muller · Jeffrey S Cheng · Gaël Varoquaux · Kamalaker Dadi · Dimitrios C Gklezakos · Rajesh PN Rao · Anand Louis · Christos Papadimitriou · Santosh Vempala · Naganand Yadati · Daniel Zdeblick · Daniela M Witten · Nicholas Roberts · Vinay Prabhu · Pierre Bellec · Poornima Ramesh · Jakob H Macke · Santiago Cadena · Guillaume Bellec · Franz Scherr · Owen Marschall · Robert Kim · Hannes Rapp · Marcio Fonseca · Oliver Armitage · Jiwoong Im · Thomas Hardcastle · Abhishek Sharma · Wyeth Bair · Adrian Valente · Shane Shang · Merav Stern · Rutuja Patil · Peter Wang · Sruthi Gorantla · Peter Stratton · Tristan Edwards · Jialin Lu · Martin Ester · Yurii Vlasov · Siavash Golkar -
2019 : Opening Remarks »
Guillaume Lajoie · Jessica Thompson · Maximilian Puelma Touzel · Eli Shlizerman · Konrad Kording -
2019 Workshop: Real Neurons & Hidden Units: future directions at the intersection of neuroscience and AI »
Guillaume Lajoie · Eli Shlizerman · Maximilian Puelma Touzel · Jessica Thompson · Konrad Kording -
2019 Poster: Comparison Against Task Driven Artificial Neural Networks Reveals Functional Organization of Mouse Visual Cortex »
Jianghong Shi · Eric Shea-Brown · Michael Buice -
2019 Poster: Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics »
Giancarlo Kerg · Kyle Goyette · Maximilian Puelma Touzel · Gauthier Gidel · Eugene Vorontsov · Yoshua Bengio · Guillaume Lajoie -
2016 Poster: High resolution neural connectivity from incomplete tracing data using nonnegative spline regression »
Kameron Harris · Stefan Mihalas · Eric Shea-Brown