Timezone: »
Recent theoretical works have characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic mean-field limit when the width tends towards infinity. At initialization, the random sampling of the parameters leads to deviations from the mean-field limit dictated by the classical Central Limit Theorem (CLT). However, since gradient descent induces correlations among the parameters, it is of interest to analyze how these fluctuations evolve. In this work, we derive a dynamical CLT to prove that the asymptotic fluctuations around the mean limit remain bounded in mean square throughout training. The upper bound is given by a Monte-Carlo resampling error, with a variance that depends on the 2-norm of the underlying measure, which also controls the generalization error. This motivates the use of this 2-norm as a regularization term during training. Furthermore, if the mean-field dynamics converges to a measure that interpolates the training data, we prove that the asymptotic deviation eventually vanishes in the CLT scaling. We also complement these results with numerical experiments.
Author Information
Zhengdao Chen (New York University)
Grant Rotskoff (Stanford University)
Joan Bruna (NYU)
Eric Vanden-Eijnden (New York University)
More from the Same Authors
-
2021 : An Extensible Benchmark Suite for Learning to Simulate Physical Systems »
Karl Otness · Arvi Gjoka · Joan Bruna · Daniele Panozzo · Benjamin Peherstorfer · Teseo Schneider · Denis Zorin -
2021 Spotlight: Offline RL Without Off-Policy Evaluation »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2021 : Quantile Filtered Imitation Learning »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2022 Poster: Learning Optimal Flows for Non-Equilibrium Importance Sampling »
Yu Cao · Eric Vanden-Eijnden -
2023 : Sharp analysis of learning a flow-based generative model from limited sample complexity »
Hugo Cui · Eric Vanden-Eijnden · Florent Krzakala · Lenka Zdeborová -
2023 Poster: A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks »
Vignesh Kothapalli · Tom Tirer · Joan Bruna -
2023 Poster: Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation »
David Brandfonbrener · Ofir Nachum · Joan Bruna -
2023 Poster: Efficient Training of Energy-Based Models Using Jarzynski Equality »
Davide Carbone · Mengjian Hua · Simon Coste · Eric Vanden-Eijnden -
2023 Poster: On Single-Index Models beyond Gaussian Data »
Aaron Zweig · Loucas PILLAUD-VIVIEN · Joan Bruna -
2022 Poster: Exponential Separations in Symmetric Neural Networks »
Aaron Zweig · Joan Bruna -
2022 Poster: When does return-conditioned supervised learning work for offline reinforcement learning? »
David Brandfonbrener · Alberto Bietti · Jacob Buckman · Romain Laroche · Joan Bruna -
2022 Poster: Learning sparse features can lead to overfitting in neural networks »
Leonardo Petrini · Francesco Cagnetta · Eric Vanden-Eijnden · Matthieu Wyart -
2022 Poster: On Non-Linear operators for Geometric Deep Learning »
Grégoire Sergeant-Perthuis · Jakob Maier · Joan Bruna · Edouard Oyallon -
2022 Poster: Learning single-index models with shallow neural networks »
Alberto Bietti · Joan Bruna · Clayton Sanford · Min Jae Song -
2021 Poster: On the Sample Complexity of Learning under Geometric Stability »
Alberto Bietti · Luca Venturi · Joan Bruna -
2021 Poster: On the Cryptographic Hardness of Learning Single Periodic Neurons »
Min Jae Song · Ilias Zadik · Joan Bruna -
2021 Poster: Offline RL Without Off-Policy Evaluation »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2020 Poster: A mean-field analysis of two-player zero-sum games »
Carles Domingo-Enrich · Samy Jelassi · Arthur Mensch · Grant Rotskoff · Joan Bruna -
2020 Poster: Can Graph Neural Networks Count Substructures? »
Zhengdao Chen · Lei Chen · Soledad Villar · Joan Bruna -
2020 Poster: Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions »
Stefano Sarao Mannelli · Eric Vanden-Eijnden · Lenka Zdeborová -
2020 Session: Orals & Spotlights Track 26: Graph/Relational/Theory »
Joan Bruna · Cassio de Campos -
2020 Poster: IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method »
Yossi Arjevani · Joan Bruna · Bugra Can · Mert Gurbuzbalaban · Stefanie Jegelka · Hongzhou Lin -
2020 Spotlight: IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method »
Yossi Arjevani · Joan Bruna · Bugra Can · Mert Gurbuzbalaban · Stefanie Jegelka · Hongzhou Lin -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 : Surya Ganguli, Yasaman Bahri, Florent Krzakala moderated by Lenka Zdeborova »
Florent Krzakala · Yasaman Bahri · Surya Ganguli · Lenka Zdeborová · Adji Bousso Dieng · Joan Bruna -
2019 : Poster Spotlight 1 »
David Brandfonbrener · Joan Bruna · Tom Zahavy · Haim Kaplan · Yishay Mansour · Nikos Karampatziakis · John Langford · Paul Mineiro · Donghwan Lee · Niao He -
2019 : Opening Remarks »
Reinhard Heckel · Paul Hand · Alex Dimakis · Joan Bruna · Deanna Needell · Richard Baraniuk -
2019 Workshop: Solving inverse problems with deep networks: New architectures, theoretical foundations, and applications »
Reinhard Heckel · Paul Hand · Richard Baraniuk · Joan Bruna · Alex Dimakis · Deanna Needell -
2019 Poster: Gradient Dynamics of Shallow Univariate ReLU Networks »
Francis Williams · Matthew Trager · Daniele Panozzo · Claudio Silva · Denis Zorin · Joan Bruna -
2019 Poster: On the Expressive Power of Deep Polynomial Neural Networks »
Joe Kileel · Matthew Trager · Joan Bruna -
2019 Poster: Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias »
Stéphane d'Ascoli · Levent Sagun · Giulio Biroli · Joan Bruna -
2019 Poster: On the equivalence between graph isomorphism testing and function approximation with GNNs »
Zhengdao Chen · Soledad Villar · Lei Chen · Joan Bruna -
2019 Poster: Stability of Graph Scattering Transforms »
Fernando Gama · Alejandro Ribeiro · Joan Bruna -
2018 : Invited Talk 3 »
Joan Bruna -
2018 : Joan Bruna »
Joan Bruna -
2018 Poster: Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks »
Grant Rotskoff · Eric Vanden-Eijnden -
2017 Tutorial: Geometric Deep Learning on Graphs and Manifolds »
Michael Bronstein · Joan Bruna · arthur szlam · Xavier Bresson · Yann LeCun -
2014 Poster: Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation »
Emily Denton · Wojciech Zaremba · Joan Bruna · Yann LeCun · Rob Fergus