We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler than batch normalization, it still provides much of the latter's speed-up. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
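Concretely, weight normalization expresses each weight vector as w = g · v / ||v||, where the scalar g carries the length of w and the vector v its direction. The sketch below illustrates this reparameterization for a single dense layer in NumPy; the layer sizes, variable names, and the simple initialization g = ||v|| are illustrative assumptions, not the paper's released code.

```python
import numpy as np

# Minimal sketch of weight normalization for one dense layer
# (assumed NumPy setting; `v`, `g`, `b`, and `forward` are
# illustrative names, not from the paper's implementation).
rng = np.random.default_rng(0)

n_in, n_out = 784, 256
v = rng.normal(size=(n_in, n_out)) * 0.05  # direction parameters, one column per unit
g = np.linalg.norm(v, axis=0)              # per-unit scale; chosen so w == v at init
b = np.zeros(n_out)

def forward(x):
    # Effective weights: rescale each column of v to length g,
    # i.e. w = g * v / ||v|| applied per output unit.
    w = v * (g / np.linalg.norm(v, axis=0))
    return x @ w + b

x = rng.normal(size=(32, n_in))
h = forward(x)  # shape (32, n_out); gradients w.r.t. g and v
                # follow by the chain rule through w.
```

Optimizing g and v directly means a gradient step can change the scale and the direction of the effective weights independently, which is the decoupling referred to above.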
Author Information
Tim Salimans (Algoritmica)
Diederik Kingma (Google)
More from the Same Authors
- 2022: On Distillation of Guided Diffusion Models
  Chenlin Meng · Ruiqi Gao · Diederik Kingma · Stefano Ermon · Jonathan Ho · Tim Salimans
- 2023 Poster: Understanding Diffusion Objectives as the ELBO with Data Augmentation
  Diederik Kingma · Ruiqi Gao
- 2023 Oral: Understanding Diffusion Objectives as the ELBO with Data Augmentation
  Diederik Kingma · Ruiqi Gao
- 2021 Poster: Variational Diffusion Models
  Diederik Kingma · Tim Salimans · Ben Poole · Jonathan Ho
- 2020 Poster: ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA
  Ilyes Khemakhem · Ricardo Monti · Diederik Kingma · Aapo Hyvarinen
- 2020 Spotlight: ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA
  Ilyes Khemakhem · Ricardo Monti · Diederik Kingma · Aapo Hyvarinen
- 2018 Poster: Glow: Generative Flow with Invertible 1x1 Convolutions
  Diederik Kingma · Prafulla Dhariwal
- 2017: Evolutionary Strategies
  Tim Salimans
- 2017 Workshop: Bayesian Deep Learning
  Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Diederik Kingma · Zoubin Ghahramani · Kevin Murphy · Max Welling
- 2016 Oral: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
  Tim Salimans · Diederik Kingma
- 2016 Poster: Improving Variational Autoencoders with Inverse Autoregressive Flow
  Diederik Kingma · Tim Salimans · Rafal Jozefowicz · Peter Chen · Xi Chen · Ilya Sutskever · Max Welling
- 2016 Poster: Improved Techniques for Training GANs
  Tim Salimans · Ian Goodfellow · Wojciech Zaremba · Vicki Cheung · Alec Radford · Peter Chen · Xi Chen
- 2015: Variational Auto-Encoders and Extensions
  Diederik Kingma
- 2015 Poster: Variational Dropout and the Local Reparameterization Trick
  Diederik Kingma · Tim Salimans · Max Welling
- 2014 Workshop: High-energy particle physics, machine learning, and the HiggsML data challenge (HEPML)
  Glen Cowan · Balázs Kégl · Kyle Cranmer · Gábor Melis · Tim Salimans · Vladimir Vava Gligorov · Daniel Whiteson · Lester Mackey · Wojciech Kotlowski · Roberto Díaz Morales · Pierre Baldi · Cecile Germain · David Rousseau · Isabelle Guyon · Tianqi Chen
- 2014 Poster: Semi-supervised Learning with Deep Generative Models
  Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling
- 2014 Spotlight: Semi-supervised Learning with Deep Generative Models
  Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling
- 2010 Poster: Regularized estimation of image statistics by Score Matching
  Diederik Kingma · Yann LeCun