Shadowing Properties of Optimization Algorithms
Antonio Orvieto · Aurelien Lucchi

Thu Dec 12 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #217

Ordinary differential equation (ODE) models of gradient-based optimization methods can provide insights into the dynamics of learning and inspire the design of new algorithms. Unfortunately, this thought-provoking perspective is weakened by the fact that, in the worst case, the error between the algorithm's steps and its ODE approximation grows exponentially with the number of iterations. In an attempt to encourage the use of continuous-time methods in optimization, we show that, if some additional regularity on the objective is assumed, the ODE representations of Gradient Descent and Heavy-ball do not suffer from the aforementioned problem, once we allow for a small perturbation of the algorithm's initial condition. In the dynamical systems literature, this phenomenon is called shadowing. Our analysis relies on the concept of hyperbolicity, as well as on tools from numerical analysis.
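The object at the heart of the abstract — an ODE approximation of a discrete optimization method — can be sketched numerically. The following minimal illustration (an assumption-laden sketch, not the paper's analysis) shows that Gradient Descent is the explicit Euler discretization of the gradient flow x'(t) = -∇f(x(t)), and that on a strongly convex quadratic (where the minimizer is a hyperbolic fixed point) the discrete iterates track the continuous trajectory closely:

```python
import numpy as np

# Sketch: gradient descent on the quadratic f(x) = 0.5 * x^T A x
# is the explicit Euler discretization of the gradient flow
#   x'(t) = -grad f(x(t)) = -A x(t).
# The matrix A and step size below are illustrative choices, not from the paper.

A = np.diag([1.0, 10.0])            # strongly convex quadratic; minimizer x* = 0
grad = lambda x: A @ x

h = 0.01                            # GD step size = Euler discretization step
n_steps = 500
x0 = np.array([1.0, 1.0])

# Gradient Descent: x_{k+1} = x_k - h * grad f(x_k)  (one Euler step per iterate)
x_gd = x0.copy()
for _ in range(n_steps):
    x_gd = x_gd - h * grad(x_gd)

# Exact gradient-flow solution at time t = h * n_steps; since A is diagonal,
# x(t) = exp(-A t) x0 reduces to a componentwise exponential.
t = h * n_steps
x_ode = np.exp(-np.diag(A) * t) * x0

err = np.linalg.norm(x_gd - x_ode)  # discrete-vs-continuous trajectory gap
print(err)
```

On this well-conditioned hyperbolic example the gap stays small (here on the order of 1e-4); the paper's point is that, in the worst case and without extra regularity, such a gap can instead grow exponentially in the number of iterations.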

Author Information

Antonio Orvieto (ETH Zurich)

PhD student at ETH Zurich. I'm interested in the design and analysis of adaptive stochastic momentum optimization algorithms for non-convex machine learning problems. Publications: 2 NeurIPS, 1 ICML, 1 AISTATS, 1 UAI. 1 patent on a learning algorithm for an impact screwdriver. Besides my PhD research, I am involved in several computational systems biology projects at ETH Zurich, such as SignalX. I also work as a biomedical data analyst (genetic rare diseases research) at the University of Padua.

Aurelien Lucchi (ETH Zurich)