Skip to yearly menu bar Skip to main content

Workshop: OPT 2021: Optimization for Machine Learning

Faster Quasi-Newton Methods for Linear Composition Problems

Betty Shea · Mark Schmidt


Despite its lack of theoretical guarantees, the limited memory version of BFGS (l-BFGS) is widely used in practice for large-scale optimization problems. In particular, when combined with the Wolfe conditions, l-BFGS tends to find solutions significantly faster than methods with known convergence rates such as gradient descent (GD). Similarly, inexact stepsize searches have outsized practical impact despite, in theory, having the same worst-case complexity as using a constant step size. These search techniques are often used for linear composition problems which are known to allow efficient linesearches and subspace optimization (SO). In this paper, we propose a version of l-BFGS that is faster for linear composition problems. Our method combines a l-BFGS direction with a momentum direction using SO. We explore practical SO issues that include extending the Wolfe conditions from one- to multi-dimension. We experimentally compare our method to standard l-BFGS and to a SO method that is known to be efficient.

Chat is not available.