Overparameterization is a key factor explaining the global convergence of gradient descent (GD) for neural networks in the absence of convexity. Besides the well-studied lazy regime, infinite-width (mean-field) analyses have been developed for shallow networks, relying on convex optimization techniques. To bridge the gap between the lazy and mean-field regimes, we study Residual Networks (ResNets) in which the residual block has a linear parameterization while still being nonlinear. Such ResNets admit both infinite-depth and infinite-width limits, encoding residual blocks in a Reproducing Kernel Hilbert Space (RKHS). In this limit, we prove a local Polyak-Łojasiewicz inequality. Thus, every critical point is a global minimizer and a local convergence result for GD holds, recovering the lazy regime. In contrast with other mean-field studies, our analysis applies to both parametric and non-parametric cases under an expressivity condition on the residuals. It also leads to a practical and quantified recipe: starting from a universal RKHS, Random Fourier Features are applied to obtain a finite-dimensional parameterization satisfying our expressivity condition with high probability.
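The recipe above relies on Random Fourier Features (Rahimi & Recht) to turn a universal RKHS into a finite-dimensional parameterization. A minimal sketch of that approximation step, assuming a Gaussian (RBF) kernel and illustrative names (`rff_features`, `sigma`, `m`) not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, W, b):
    """Map inputs X of shape (n, d) to cosine features of shape (n, m)
    so that the inner product phi(x) @ phi(y) approximates
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

d, m, sigma = 3, 2000, 1.0
W = rng.normal(scale=1.0 / sigma, size=(d, m))  # random frequencies ~ N(0, 1/sigma^2)
b = rng.uniform(0.0, 2 * np.pi, size=m)         # random phases

X = rng.normal(size=(5, d))
# Exact Gaussian kernel matrix versus its finite-dimensional approximation.
K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / (2 * sigma**2))
Phi = rff_features(X, W, b)
K_approx = Phi @ Phi.T
print(np.abs(K_exact - K_approx).max())  # small when m is large
```

The approximation error concentrates as the number of features `m` grows, which is what allows an expressivity condition on the kernel to transfer, with high probability, to the finite parameterization.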
Author Information
Raphaël Barboni (CNRS, projet NORIA, ENS - PSL)
Gabriel Peyré (CNRS and ENS)
Francois-Xavier Vialard (University Gustave Eiffel)
More from the Same Authors
-
2021 : Faster Unbalanced Optimal Transport: Translation invariant Sinkhorn and 1-D Frank-Wolfe »
Thibault Sejourne · Francois-Xavier Vialard · Gabriel Peyré -
2022 Poster: Parameter tuning and model selection in Optimal Transport with semi-dual Brenier formulation »
Adrien Vacher · Francois-Xavier Vialard -
2023 Poster: Abide by the law and follow the flow: conservation laws for gradient flows »
Sibylle Marcotte · Remi Gribonval · Gabriel Peyré -
2023 Oral: Abide by the law and follow the flow: conservation laws for gradient flows »
Sibylle Marcotte · Remi Gribonval · Gabriel Peyré -
2022 Poster: Do Residual Neural Networks discretize Neural Ordinary Differential Equations? »
Michael Sander · Pierre Ablin · Gabriel Peyré -
2021 Workshop: Optimal Transport and Machine Learning »
Jason Altschuler · Charlotte Bunne · Laetitia Chapel · Marco Cuturi · Rémi Flamary · Gabriel Peyré · Alexandra Suvorikova -
2021 Poster: The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation »
Thibault Sejourne · Francois-Xavier Vialard · Gabriel Peyré -
2020 Poster: Faster Wasserstein Distance Estimation with the Sinkhorn Divergence »
Lénaïc Chizat · Pierre Roussillon · Flavien Léger · François-Xavier Vialard · Gabriel Peyré -
2020 Poster: Online Sinkhorn: Optimal Transport distances from sample streams »
Arthur Mensch · Gabriel Peyré -
2020 Oral: Online Sinkhorn: Optimal Transport distances from sample streams »
Arthur Mensch · Gabriel Peyré -
2020 Poster: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2020 Oral: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2019 Workshop: Optimal Transport for Machine Learning »
Marco Cuturi · Gabriel Peyré · Rémi Flamary · Alexandra Suvorikova -
2019 Poster: Region-specific Diffeomorphic Metric Mapping »
Zhengyang Shen · Francois-Xavier Vialard · Marc Niethammer -
2019 Poster: Universal Invariant and Equivariant Graph Neural Networks »
Nicolas Keriven · Gabriel Peyré