Poster
Quantitative Propagation of Chaos for SGD in Wide Neural Networks
Valentin De Bortoli · Alain Durmus · Xavier Fontaine · Umut Simsekli
In this paper, we investigate the limiting behavior of a
continuous-time counterpart of the Stochastic Gradient Descent (SGD)
algorithm applied to two-layer overparameterized neural networks, as
the number of neurons (i.e., the size of the hidden layer)
$N \to +\infty$. Following a probabilistic approach, we show
'propagation of chaos' for the particle system defined by this
continuous-time dynamics under different scenarios, indicating that
the statistical interaction between the particles asymptotically
vanishes. In particular, we establish quantitative convergence with
respect to $N$ of any particle to a solution of a mean-field
McKean-Vlasov equation in the metric space endowed with the
Wasserstein distance. In comparison to previous works on the
subject, we consider settings in which the sequence of stepsizes in
SGD can potentially depend on the number of neurons and the
iterations. We then identify two regimes under which different
mean-field limits are obtained, one of them corresponding to an
implicitly regularized version of the minimization problem at
hand. We perform various experiments on real datasets to validate
our theoretical results, assessing the existence of these two
regimes on classification problems and illustrating our convergence
results.
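For orientation, the kind of statement the abstract refers to can be written in a generic textbook form. The block below is only a sketch under stated assumptions: the drift $b$, the diffusion coefficient $\sigma$, the Brownian motions $B^i$ and the particle notation $X^{i,N}$ are placeholder symbols that do not appear in the abstract, and the paper's actual dynamics and convergence rates are not reproduced here.

```latex
% Generic interacting particle system (placeholder notation, not the paper's exact dynamics):
% each of the N particles (neurons) is driven by the empirical measure of all particles.
\begin{align*}
  \mathrm{d}X_t^{i,N} &= b\bigl(X_t^{i,N}, \mu_t^{N}\bigr)\,\mathrm{d}t
      + \sigma\bigl(X_t^{i,N}, \mu_t^{N}\bigr)\,\mathrm{d}B_t^{i},
  & \mu_t^{N} &= \frac{1}{N}\sum_{j=1}^{N} \delta_{X_t^{j,N}}.
\end{align*}
% Generic mean-field (McKean-Vlasov) limit: the empirical measure is replaced
% by the law of the solution itself.
\begin{align*}
  \mathrm{d}X_t &= b\bigl(X_t, \mu_t\bigr)\,\mathrm{d}t
      + \sigma\bigl(X_t, \mu_t\bigr)\,\mathrm{d}B_t,
  & \mu_t &= \mathrm{Law}(X_t).
\end{align*}
```

In this notation, propagation of chaos means that for each fixed $i$ the particle $X^{i,N}$ becomes close, in Wasserstein distance, to a copy of the limit process $X$ as $N \to +\infty$. A quantitative version bounds that distance explicitly in terms of $N$.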
Author Information
Valentin De Bortoli (ENS Paris-Saclay)
Alain Durmus (ENS Paris-Saclay)
Xavier Fontaine (ENS Paris-Saclay)
Umut Simsekli (Inria/ENS)
More from the Same Authors
- 2021 Spotlight: Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling
  Valentin De Bortoli · James Thornton · Jeremy Heng · Arnaud Doucet
- 2021 Spotlight: Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
  Alexander Camuto · George Deligiannidis · Murat Erdogdu · Mert Gurbuzbalaban · Umut Simsekli · Lingjiong Zhu
- 2022 : Spectral Diffusion Processes
  Angus Phillips · Thomas Seror · Michael Hutchinson · Valentin De Bortoli · Arnaud Doucet · Emile Mathieu
- 2023 Poster: Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models
  Anant Raj · Umut Simsekli · Alessandro Rudi
- 2023 Poster: Approximate Heavy Tails in Offline (Multi-Pass) Stochastic Gradient Descent
  Kruno Lehman · Alain Durmus · Umut Simsekli
- 2023 Poster: Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent
  Lingjiong Zhu · Mert Gurbuzbalaban · Anant Raj · Umut Simsekli
- 2023 Poster: Learning via Wasserstein-Based High Probability Generalization Bounds
  Paul Viallard · Maxime Haddouche · Umut Simsekli · Benjamin Guedj
- 2023 Workshop: Heavy Tails in ML: Structure, Stability, Dynamics
  Mert Gurbuzbalaban · Stefanie Jegelka · Michael Mahoney · Umut Simsekli
- 2023 Workshop: NeurIPS 2023 Workshop on Diffusion Models
  Bahjat Kawar · Valentin De Bortoli · Charlotte Bunne · James Thornton · Jiaming Song · Jong Chul Ye · Chenlin Meng
- 2022 Workshop: NeurIPS 2022 Workshop on Score-Based Methods
  Yingzhen Li · Yang Song · Valentin De Bortoli · Francois-Xavier Briol · Wenbo Gong · Alexia Jolicoeur-Martineau · Arash Vahdat
- 2022 Poster: Can Push-forward Generative Models Fit Multimodal Distributions?
  Antoine Salmona · Valentin De Bortoli · Julie Delon · Agnes Desolneux
- 2022 Poster: A Continuous Time Framework for Discrete Denoising Models
  Andrew Campbell · Joe Benton · Valentin De Bortoli · Thomas Rainforth · George Deligiannidis · Arnaud Doucet
- 2022 Poster: Riemannian Score-Based Generative Modelling
  Valentin De Bortoli · Emile Mathieu · Michael Hutchinson · James Thornton · Yee Whye Teh · Arnaud Doucet
- 2022 Poster: Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent
  Soon Hoe Lim · Yijun Wan · Umut Simsekli
- 2022 Poster: Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers
  Sejun Park · Umut Simsekli · Murat Erdogdu
- 2022 Poster: Local-Global MCMC kernels: the best of both worlds
  Sergey Samsonov · Evgeny Lagutin · Marylou Gabrié · Alain Durmus · Alexey Naumov · Eric Moulines
- 2022 Poster: Wavelet Score-Based Generative Modeling
  Florentin Guth · Simon Coste · Valentin De Bortoli · Stephane Mallat
- 2022 Poster: FedPop: A Bayesian Approach for Personalised Federated Learning
  Nikita Kotelevskii · Maxime Vono · Alain Durmus · Eric Moulines
- 2021 Poster: Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
  Melih Barsbey · Milad Sefidgaran · Murat Erdogdu · Gaël Richard · Umut Simsekli
- 2021 Poster: Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling
  Valentin De Bortoli · James Thornton · Jeremy Heng · Arnaud Doucet
- 2021 Poster: Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
  Tolga Birdal · Aaron Lou · Leonidas Guibas · Umut Simsekli
- 2021 Poster: NEO: Non Equilibrium Sampling on the Orbits of a Deterministic Transform
  Achille Thin · Yazid Janati El Idrissi · Sylvain Le Corff · Charles Ollion · Eric Moulines · Arnaud Doucet · Alain Durmus · Christian X Robert
- 2021 Poster: Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance
  Hongjian Wang · Mert Gurbuzbalaban · Lingjiong Zhu · Umut Simsekli · Murat Erdogdu
- 2021 Poster: Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections
  Kimia Nadjahi · Alain Durmus · Pierre E Jacob · Roland Badeau · Umut Simsekli
- 2021 Poster: Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
  Alexander Camuto · George Deligiannidis · Murat Erdogdu · Mert Gurbuzbalaban · Umut Simsekli · Lingjiong Zhu
- 2021 Poster: Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
  Alain Durmus · Eric Moulines · Alexey Naumov · Sergey Samsonov · Kevin Scaman · Hoi-To Wai
- 2020 Poster: Statistical and Topological Properties of Sliced Probability Divergences
  Kimia Nadjahi · Alain Durmus · Lénaïc Chizat · Soheil Kolouri · Shahin Shahrampour · Umut Simsekli
- 2020 Spotlight: Statistical and Topological Properties of Sliced Probability Divergences
  Kimia Nadjahi · Alain Durmus · Lénaïc Chizat · Soheil Kolouri · Shahin Shahrampour · Umut Simsekli
- 2020 Poster: Explicit Regularisation in Gaussian Noise Injections
  Alexander Camuto · Matthew Willetts · Umut Simsekli · Stephen J Roberts · Chris C Holmes
- 2020 Poster: Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
  Umut Simsekli · Ozan Sener · George Deligiannidis · Murat Erdogdu
- 2020 Spotlight: Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
  Umut Simsekli · Ozan Sener · George Deligiannidis · Murat Erdogdu
- 2019 Poster: Asymptotic Guarantees for Learning Generative Models with the Sliced-Wasserstein Distance
  Kimia Nadjahi · Alain Durmus · Umut Simsekli · Roland Badeau
- 2019 Spotlight: Asymptotic Guarantees for Learning Generative Models with the Sliced-Wasserstein Distance
  Kimia Nadjahi · Alain Durmus · Umut Simsekli · Roland Badeau
- 2019 Poster: First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise
  Thanh Huy Nguyen · Umut Simsekli · Mert Gurbuzbalaban · Gaël Richard
- 2019 Poster: Copula-like Variational Inference
  Marcel Hirt · Petros Dellaportas · Alain Durmus
- 2019 Poster: Generalized Sliced Wasserstein Distances
  Soheil Kolouri · Kimia Nadjahi · Umut Simsekli · Roland Badeau · Gustavo Rohde
- 2018 Poster: The promises and pitfalls of Stochastic Gradient Langevin Dynamics
  Nicolas Brosse · Alain Durmus · Eric Moulines
- 2018 Poster: Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC
  Tolga Birdal · Umut Simsekli · Mustafa Onur Eken · Slobodan Ilic
- 2017 Poster: Learning the Morphology of Brain Signals Using Alpha-Stable Convolutional Sparse Coding
  Mainak Jas · Tom Dupré la Tour · Umut Simsekli · Alexandre Gramfort
- 2016 Poster: Stochastic Gradient Richardson-Romberg Markov Chain Monte Carlo
  Alain Durmus · Umut Simsekli · Eric Moulines · Roland Badeau · Gaël Richard
- 2011 Poster: Generalised Coupled Tensor Factorisation
  Kenan Y Yılmaz · Taylan Cemgil · Umut Simsekli