Learning to optimize over linearly convergent algorithms: gotta characterize 'em all
Abstract
In high-stakes engineering applications, optimizers must come with provable worst-case guarantees over a mathematically defined class of problems. Yet, designing for the worst case often sacrifices performance on the instances that arise in practice. While learning-to-optimize approaches hold promise for enhancing average-case optimizer performance on sets of target problems of interest, they often require conservative safeguarding mechanisms to maintain convergent behavior. In this paper, we characterize the class of algorithms that achieve linear convergence for several classes of nonsmooth composite optimization problems. This is achieved by parametrizing all augmentations to a baseline linearly convergent algorithm – that is, all perturbations to its update rule – that preserve its certified rate on the entire class. Our results apply to augmenting algorithms such as gradient descent for nonconvex, gradient-dominated functions; Nesterov's accelerated method for strongly convex functions; and projected methods for optimization over polyhedral feasibility sets. We showcase the effectiveness of the approach in solving optimization problems with tight iteration budgets, with applications to ill-conditioned systems of linear equations and model predictive control for linear systems.
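As a minimal illustrative sketch of the idea described above (not the paper's exact parametrization; the symbols $\alpha$, $v_k$, $C$, and $\rho$ are assumptions introduced here for illustration), an augmentation of a baseline gradient step can be written as
\[
  x_{k+1} = x_k - \alpha \nabla f(x_k) + v_k ,
\]
where $\alpha > 0$ is the baseline step size and $v_k$ is a (possibly learned) perturbation of the update rule. The characterization then amounts to describing all sequences $(v_k)$ for which the baseline's certified linear rate is preserved uniformly over the problem class, i.e.
\[
  \| x_k - x^\star \| \le C \, \rho^k \, \| x_0 - x^\star \| , \qquad \rho \in (0,1),
\]
for every instance $f$ in the class.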