Poster

Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems

Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong

Keywords: Optimization for Deep Networks Deep Learning Learning Theory Optimization -> Non-Convex Optimization; Theory

2020 Poster

Paper PDF [ Paper ]

Abstract

This paper proposes two efficient algorithms for computing approximate second-order stationary points (SOSPs) of problems with generic smooth non-convex objective functions and generic linear constraints. While finding (approximate) SOSPs for the class of smooth non-convex linearly constrained problems is computationally intractable, we show that generic problem instances in this class can be solved efficiently. Specifically, for a generic problem instance, we show that certain strict complementarity (SC) condition holds for all Karush-Kuhn-Tucker (KKT) solutions. Based on this condition, we design an algorithm named Successive Negative-curvature grAdient Projection (SNAP), which performs either conventional gradient projection or some negative curvature-based projection steps to find SOSPs. SNAP is a second-order algorithm that requires $\widetilde{\mathcal{O}}(\max\{1/\epsilon^2_G,1/\epsilon^3_H\})$ iterations to compute an $(\epsilon_G,\epsilon_H)$-SOSP, where $\widetilde{\mathcal{O}}$ hides the iteration complexity for eigenvalue-decomposition. Building on SNAP, we propose a first-order algorithm, named SNAP$^+$, that requires $\mathcal{O}(1/\epsilon^{2.5})$ iterations to compute $(\epsilon, \sqrt{\epsilon})$-SOSP. The per-iteration computational complexities of our algorithms are polynomial in the number of constraints and problem dimension. To the best of our knowledge, this is the first time that first-order algorithms with polynomial per-iteration complexity and global sublinear rate are designed to find SOSPs of the important class of non-convex problems with linear constraints (almost surely).

Video

Chat is not available.