Timezone: »
Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters. Based on this intuition, we explore the use of exact per-sample Hessian-vector products and gradients to construct optimizers that are self-tuning and hyperparameter-free. Based on a dynamics model of the gradient, we derive a process which leads to a curvature-corrected, noise-adaptive online gradient estimate. The smoothness of our updates makes it more amenable to simple step size selection schemes, which we also base off of our estimates quantities. We prove that our model-based procedure converges in the noisy quadratic setting. Though we do not see similar gains in deep learning tasks, we can match the performance of well-tuned optimizers and ultimately, this is an interesting step for constructing self-tuning optimizers.
Author Information
Tian Qi Chen (U of Toronto)
More from the Same Authors
-
2019 Poster: Latent Ordinary Differential Equations for Irregularly-Sampled Time Series »
Yulia Rubanova · Tian Qi Chen · David Duvenaud -
2019 Poster: Residual Flows for Invertible Generative Modeling »
Tian Qi Chen · Jens Behrmann · David Duvenaud · Joern-Henrik Jacobsen -
2019 Spotlight: Residual Flows for Invertible Generative Modeling »
Tian Qi Chen · Jens Behrmann · David Duvenaud · Joern-Henrik Jacobsen -
2019 Poster: Neural Networks with Cheap Differential Operators »
Tian Qi Chen · David Duvenaud -
2019 Spotlight: Neural Networks with Cheap Differential Operators »
Tian Qi Chen · David Duvenaud -
2018 Poster: Isolating Sources of Disentanglement in Variational Autoencoders »
Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud -
2018 Oral: Isolating Sources of Disentanglement in Variational Autoencoders »
Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud -
2018 Poster: Neural Ordinary Differential Equations »
Tian Qi Chen · Yulia Rubanova · Jesse Bettencourt · David Duvenaud -
2018 Oral: Neural Ordinary Differential Equations »
Tian Qi Chen · Yulia Rubanova · Jesse Bettencourt · David Duvenaud