
Poster

Adaptive Negative Curvature Descent with Applications in Non-convex Optimization

Mingrui Liu · Zhe Li · Xiaoyu Wang · Jinfeng Yi · Tianbao Yang

Room 210 #81

Keywords: [ Optimization ] [ Non-Convex Optimization ]


Abstract: The negative curvature descent (NCD) method has been used to design deterministic and stochastic algorithms for non-convex optimization that aim at finding second-order stationary points or local minima. In existing studies, NCD needs to approximate the smallest eigenvalue of the Hessian matrix with sufficiently high precision (e.g., $\epsilon_2\ll 1$) in order to reach a sufficiently accurate second-order stationary solution (i.e., $\lambda_{\min}(\nabla^2 f(x))\geq -\epsilon_2$). One issue with this approach is that the target precision $\epsilon_2$ is usually set very small in order to find a high-quality solution, which increases the complexity of computing a negative curvature direction. To address this issue, we propose an adaptive NCD that allows an error dependent on the current gradient's magnitude when approximating the smallest eigenvalue of the Hessian, and that encourages competition between a noisy NCD step and a gradient descent step. We consider applications of the proposed adaptive NCD to both deterministic and stochastic non-convex optimization, and demonstrate that it can reduce the overall complexity of computing negative curvatures during the course of optimization without sacrificing the iteration complexity.
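To make the two ideas in the abstract concrete, the following is a minimal Python sketch of a single step of such a scheme, not the authors' implementation: the eigenvalue solver's tolerance is coupled to the current gradient norm (the exact coupling rule and the `L`, `L2` smoothness constants here are illustrative assumptions), and the step taken is whichever of the gradient step or the negative-curvature step has the larger guaranteed decrease under standard smoothness bounds. The noise injection of the "noisy NCD step" is omitted for brevity.

```python
import numpy as np
from scipy.sparse.linalg import eigsh


def adaptive_ncd_step(grad, hess, x, L=1.0, L2=1.0, eps2=1e-3):
    """One illustrative step of an adaptive negative-curvature-descent scheme.

    grad, hess: callables returning the gradient vector and Hessian matrix at x.
    L, L2: assumed gradient- and Hessian-Lipschitz constants (hypothetical values).
    eps2: target second-order accuracy.
    """
    g = grad(x)
    g_norm = np.linalg.norm(g)

    # Adaptive tolerance: coarse while the gradient is large, tightening to
    # eps2 only as the gradient shrinks (one plausible coupling, not the
    # paper's exact rule).
    tol = max(eps2, np.sqrt(g_norm))

    # Approximate the smallest eigenpair of the Hessian to this tolerance
    # via Lanczos (scipy's eigsh with which='SA').
    H = hess(x)
    lam, v = eigsh(H, k=1, which="SA", tol=tol)
    lam, v = lam[0], v[:, 0]

    # Guaranteed decrease of a gradient step (L-smoothness bound) versus a
    # negative-curvature step (cubic bound via Hessian-Lipschitz continuity).
    dec_gd = g_norm ** 2 / (2.0 * L)
    dec_nc = max(-lam, 0.0) ** 3 / (3.0 * L2 ** 2)

    if dec_gd >= dec_nc:
        # Gradient-descent step wins the competition.
        return x - g / L

    # Negative-curvature step, oriented against the gradient.
    d = -np.sign(v @ g) * v if v @ g != 0 else v
    return x + (abs(lam) / L2) * d
```

In this sketch, a loose tolerance when the gradient is large means the Lanczos solver does less work early in optimization, which is the source of the complexity savings the abstract refers to.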
