Spotlight
Gradient Descent Can Take Exponential Time to Escape Saddle Points
Simon Du · Chi Jin · Jason D Lee · Michael Jordan · Aarti Singh · Barnabas Poczos

Tue Dec 05 11:25 AM -- 11:30 AM (PST) @ Hall C

Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015; Jin et al., 2017] is not slowed down by saddle points: it can find an approximate local minimizer in polynomial time. This result implies that GD is inherently slower than perturbed GD, and justifies the importance of adding perturbations for efficient non-convex optimization. Experiments are also provided to illustrate these theoretical findings.
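
Below is a minimal sketch of gradient descent with perturbations in the spirit of Ge et al. [2015] and Jin et al. [2017], not the exact algorithm analyzed in this paper: when the gradient is small (a possible saddle region), a small random perturbation is added so the iterate can leave the unstable direction. All parameter names, thresholds, and values here are illustrative assumptions.

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, eta=0.01, radius=0.1,
                               g_thresh=1e-3, t_noise=10,
                               max_iters=500, seed=0):
    """Gradient descent that injects noise near suspected saddle points.

    Illustrative sketch only; parameters are hypothetical, not the paper's.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_noise = -t_noise  # allow a perturbation immediately if needed
    for t in range(max_iters):
        g = grad(x)
        # Small gradient may mean a saddle point: perturb, but not too often.
        if np.linalg.norm(g) <= g_thresh and t - last_noise >= t_noise:
            x = x + radius * rng.standard_normal(x.shape)
            last_noise = t
            g = grad(x)
        x = x - eta * g  # standard gradient step
    return x

# Toy example: f(x, y) = x^2 - y^2 has a saddle point at the origin.
# Plain GD started exactly at (0, 0) never moves (the gradient is zero);
# the perturbed variant escapes along the -y^2 direction.
grad_f = lambda p: np.array([2.0 * p[0], -2.0 * p[1]])
print(perturbed_gradient_descent(grad_f, x0=[0.0, 0.0]))
```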

Author Information

Simon Du (Carnegie Mellon University)
Chi Jin (UC Berkeley)
Jason D Lee (USC)
Michael Jordan (UC Berkeley)
Aarti Singh (Carnegie Mellon University)
Barnabas Poczos (Carnegie Mellon University)
