Charting Flat Minima Using the Conserved Quantities of Gradient Flow
Bo Zhao · Iordan Ganev · Robin Walters · Rose Yu · Nima Dehmamy
Event URL: https://openreview.net/forum?id=JvcLG3eek70

Empirical studies have revealed that many minima in the loss landscape of deep learning are connected and reside on a low-loss valley. Yet, little is known about the theoretical origin of these low-loss valleys. Ensemble models sampling different parts of a low-loss valley have reached state-of-the-art performance. However, we lack theoretical ways to measure what portions of low-loss valleys are being explored during training. We address these two aspects of low-loss valleys using symmetries and conserved quantities. We show that continuous symmetries in the parameter space of neural networks can give rise to low-loss valleys. We then show that conserved quantities associated with these symmetries can be used to define coordinates along low-loss valleys. These conserved quantities reveal that gradient flow only explores a small part of a low-loss valley. We use conserved quantities to explore other parts of the loss valley by proposing alternative initialization schemes.
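The link between symmetries and conserved quantities can be illustrated with a small numerical sketch. It assumes a two-layer linear network with squared loss, which is a standard textbook setting and not necessarily the setup used in the paper. In this setting the loss is unchanged under U -> U g^{-1}, V -> g V for any invertible g, and Q = U^T U - V V^T is constant along gradient flow. All names here (loss, conserved_q, gradient_step, the matrix sizes, the step size) are hypothetical choices made for illustration. Gradient descent with a small step only approximates gradient flow, so Q is expected to drift slightly rather than stay exactly fixed.

```python
import numpy as np

def loss(U, V, X, Y):
    # Squared loss of the two-layer linear network x -> U V x.
    return 0.5 * np.sum((Y - U @ V @ X) ** 2)

def conserved_q(U, V):
    # Quantity that is constant along gradient flow in this linear setting.
    return U.T @ U - V @ V.T

def gradient_step(U, V, X, Y, lr):
    # One gradient-descent step; G = (Y - U V X) X^T appears in both gradients.
    G = (Y - U @ V @ X) @ X.T
    return U + lr * G @ V.T, V + lr * U.T @ G

rng = np.random.default_rng(0)
m, h, n, k = 3, 4, 5, 20          # output, hidden, input dims; sample count
U = 0.5 * rng.normal(size=(m, h))
V = 0.5 * rng.normal(size=(h, n))
X = rng.normal(size=(n, k))
Y = rng.normal(size=(m, k))

# Symmetry: an invertible g acting on the hidden layer leaves the loss unchanged.
g = np.eye(h) + 0.1 * rng.normal(size=(h, h))
U_g, V_g = U @ np.linalg.inv(g), g @ V
symmetry_gap = abs(loss(U, V, X, Y) - loss(U_g, V_g, X, Y))

# Conservation: Q barely moves while the loss decreases.
q_start = conserved_q(U, V)
initial_loss = loss(U, V, X, Y)
for _ in range(1000):
    U, V = gradient_step(U, V, X, Y, lr=1e-4)
final_loss = loss(U, V, X, Y)
q_drift = np.max(np.abs(conserved_q(U, V) - q_start))

print("symmetry gap:", symmetry_gap)
print("loss:", initial_loss, "->", final_loss)
print("max drift in Q:", q_drift)
```

Because Q stays near its initial value, the initialization fixes which slice of the valley training can reach. This is consistent with the abstract's remark that alternative initialization schemes are a way to explore other parts of the valley.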

Author Information

Bo Zhao (University of California, San Diego)
Iordan Ganev (Radboud University)
Robin Walters (Northeastern University)
Rose Yu (UC San Diego)
Nima Dehmamy (IBM Research)

I obtained my PhD in physics, on complex systems, from Boston University in 2016. I then did a postdoc at Northeastern University, working on 3D embedded graphs and graph neural networks. My current research is on physics-informed machine learning and computational social science.

More from the Same Authors