Spherical Perspective on Learning with Normalization Layers
Simon Roburin · Yann de Mont-Marin · Andrei Bursuc · Renaud Marlet · Patrick Pérez · Mathieu Aubry
Normalization Layers (NL) are widely used in modern deep-learning architectures. Despite their apparent simplicity, their effect on optimization is not yet fully understood. We introduce a spherical framework to study the optimization of neural networks with NL from a geometric perspective. Concretely, we leverage the radial invariance of groups of parameters to translate the optimization steps onto the $L_2$ unit hypersphere. This formulation and the associated geometric interpretation shed new light on the training dynamics. We use it to derive the first effective learning rate expression of Adam. We then show theoretically and empirically that, in the presence of NL, performing SGD alone is actually equivalent to a variant of Adam constrained to the unit hypersphere.
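The radial invariance mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the toy loss, the `target` vector, and all function names are assumptions chosen for illustration, and it covers plain SGD only, not the Adam analysis. The toy loss depends on the weights only through their direction, as happens for weights followed by a normalization layer. Under that assumption the gradient is orthogonal to the weights, and one SGD step with learning rate `lr` on weights of norm `r` moves the direction exactly as a step with learning rate `lr / r**2` taken from the unit hypersphere.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    r = norm(v)
    return [x / r for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(w, target):
    # Toy scale-invariant loss (an assumption): it sees only the direction w / ||w||.
    return -dot(normalize(w), target)

def grad(w, target):
    # Analytic gradient of the toy loss with respect to w:
    # -(target - <u, target> u) / ||w||, where u = w / ||w||.
    r = norm(w)
    u = [x / r for x in w]
    ut = dot(u, target)
    return [-(t - ut * ui) / r for ui, t in zip(u, target)]

def sgd_step(w, target, lr):
    g = grad(w, target)
    return [wi - lr * gi for wi, gi in zip(w, g)]

target = [1.0, 2.0, 2.0]   # hypothetical data, for illustration only
w = [3.0, 0.0, 4.0]        # weights with norm r = 5
lr = 0.1
r = norm(w)

# 1. Radial invariance: rescaling the weights leaves the loss unchanged.
scaled = [2.0 * x for x in w]
invariance_gap = abs(loss(scaled, target) - loss(w, target))

# 2. The gradient has no radial component.
radial_component = dot(grad(w, target), w)

# 3. The direction after an SGD step on w matches the direction after a step
#    from the unit sphere with the rescaled learning rate lr / r**2.
direction_full = normalize(sgd_step(w, target, lr))
direction_sphere = normalize(sgd_step(normalize(w), target, lr / r**2))
direction_gap = max(abs(a - b) for a, b in zip(direction_full, direction_sphere))

print(invariance_gap, radial_component, direction_gap)
```

All three printed quantities should be zero up to floating-point error. The third check is the sense in which a step in the full parameter space can be read as a step on the unit hypersphere with a norm-dependent learning rate.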
Author Information
Simon Roburin (ENPC; valeo.ai)
Yann de Mont-Marin (Inria)
Andrei Bursuc (valeo.ai)
Renaud Marlet (Valeo)
Patrick Pérez (Valeo.ai)
Mathieu Aubry (École des ponts ParisTech)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 : Spherical Perspective on Learning with Normalization Layers »
More from the Same Authors
-
2020 : Paper 16: Driving Behavior Explanation with Multi-level Fusion »
Matthieu Cord · Patrick Pérez -
2022 : Improving Predictive Performance and Calibration by Weight Fusion in Semantic Segmentation »
Timo Saemann · Ahmed Hammam · Andrei Bursuc · Christoph Stiller · Horst-Michael Gross -
2022 : Multi-Modal 3D GAN for Urban Scenes »
Loïck Chambon · Mickael Chen · Tuan-Hung VU · Alexandre Boulch · Andrei Bursuc · Matthieu Cord · Patrick Pérez -
2022 : Instance-Aware Observer Network for Out-of-Distribution Object Segmentation »
Victor Besnier · Andrei Bursuc · Alexandre Briot · David Picard -
2021 : Poster Session 1 (gather.town) »
Hamed Jalali · Robert Hönig · Maximus Mutschler · Manuel Madeira · Abdurakhmon Sadiev · Egor Shulgin · Alasdair Paren · Pascal Esser · Simon Roburin · Julius Kunze · Agnieszka Słowik · Frederik Benzing · Futong Liu · Hongyi Li · Ryotaro Mitsuboshi · Grigory Malinovsky · Jayadev Naram · Zhize Li · Igor Sokolov · Sharan Vaswani -
2021 : Contributed Talks in Session 1 (Zoom) »
Sebastian Stich · Futong Liu · Abdurakhmon Sadiev · Frederik Benzing · Simon Roburin -
2021 Poster: Large-Scale Unsupervised Object Discovery »
Van Huy Vo · Elena Sizikova · Cordelia Schmid · Patrick Pérez · Jean Ponce -
2021 Poster: Re-ranking for image retrieval and transductive few-shot classification »
Xi SHEN · Yang Xiao · Shell Xu Hu · Othman Sbai · Mathieu Aubry -
2020 : Q&A: Patrick Perez »
Patrick Pérez -
2020 : Invited Talk: Patrick Perez »
Patrick Pérez -
2020 Poster: Deep Transformation-Invariant Clustering »
Tom Monnier · Thibault Groueix · Mathieu Aubry -
2020 Oral: Deep Transformation-Invariant Clustering »
Tom Monnier · Thibault Groueix · Mathieu Aubry -
2019 Poster: Learning elementary structures for 3D shape generation and matching »
Theo Deprelle · Thibault Groueix · Matthew Fisher · Vladimir Kim · Bryan Russell · Mathieu Aubry -
2019 Poster: Zero-Shot Semantic Segmentation »
Maxime Bucher · Tuan-Hung VU · Matthieu Cord · Patrick Pérez -
2019 Poster: Addressing Failure Prediction by Learning Model Confidence »
Charles Corbière · Nicolas THOME · Avner Bar-Hen · Matthieu Cord · Patrick Pérez