In this work we propose a differential-geometric motivation for Nesterov's accelerated gradient method (AGM) for strongly convex problems. By considering the optimization procedure as occurring on a Riemannian manifold with a natural structure, the AGM can be seen as the proximal point method applied in this curved space. This viewpoint can also be extended to the continuous-time case, where the accelerated gradient method arises from the natural block-implicit Euler discretization of an ODE on the manifold. We provide an analysis of the convergence rate of this ODE for quadratic objectives.
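For readers unfamiliar with the method the abstract refers to, the following is a minimal sketch of the textbook constant-momentum form of Nesterov's AGM on a strongly convex quadratic, compared against plain gradient descent. It is not the Riemannian proximal point derivation proposed in the paper. The diagonal objective, the function names (`agm_quadratic`, `gd_quadratic`, `objective`), the step size 1/L, and the momentum value are assumptions chosen for illustration.

```python
import math


def objective(a, x):
    """Diagonal quadratic f(x) = 0.5 * sum(a_i * x_i^2), with all a_i > 0."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def agm_quadratic(a, x0, steps):
    """Textbook constant-momentum AGM (illustrative, not the paper's derivation)."""
    mu, L = min(a), max(a)            # strong convexity and smoothness constants
    sqrt_kappa = math.sqrt(L / mu)    # square root of the condition number
    beta = (sqrt_kappa - 1.0) / (sqrt_kappa + 1.0)  # assumed momentum choice
    x_prev, x = list(x0), list(x0)
    for _ in range(steps):
        # Extrapolate, then take a gradient step of size 1/L from the extrapolated point.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        grad = [ai * yi for ai, yi in zip(a, y)]
        x_prev, x = x, [yi - gi / L for yi, gi in zip(y, grad)]
    return x


def gd_quadratic(a, x0, steps):
    """Plain gradient descent with step size 1/L, for comparison."""
    L = max(a)
    x = list(x0)
    for _ in range(steps):
        x = [xi - ai * xi / L for ai, xi in zip(a, x)]
    return x


a = [1.0, 10.0, 100.0]   # hypothetical curvatures, condition number 100
x0 = [1.0, 1.0, 1.0]
f_agm = objective(a, agm_quadratic(a, x0, 200))
f_gd = objective(a, gd_quadratic(a, x0, 200))
print(f_agm, f_gd)
```

On this example the accelerated iterates reach a far smaller objective value than gradient descent after the same number of steps, which is the behaviour the geometric viewpoint in the abstract aims to motivate.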
Aaron Defazio (Facebook AI Research)
More from the Same Authors
2019 Poster: On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
Aaron Defazio · Leon Bottou