Timezone: »

Sharp Analysis of Stochastic Optimization under Global Kurdyka-Lojasiewicz Inequality
Ilyas Fatkhullin · Jalal Etesami · Niao He · Negar Kiyavash

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #841
We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies global Kurdyka-{\L}ojasiewicz (KL) inequality and the queries from stochastic gradient oracles satisfy mild expected smoothness assumption. We first introduce a general framework to analyze Stochastic Gradient Descent (SGD) and its associated nonlinear dynamics under the setting. As a byproduct of our analysis, we obtain a sample complexity of $\mathcal{O}(\epsilon^{-(4-\alpha)/\alpha})$ for SGD when the objective satisfies the so called $\alpha$-P{\L} condition, where $\alpha$ is the degree of gradient domination. Furthermore, we show that a modified SGD with variance reduction and restarting (PAGER) achieves an improved sample complexity of $\mathcal{O}(\epsilon^{-2/\alpha})$ when the objective satisfies the average smoothness assumption. This leads to the first optimal algorithm for the important case of $\alpha=1$ which appears in applications such as policy optimization in reinforcement learning.

Author Information

Ilyas Fatkhullin (ETH Zurich)
Jalal Etesami (Technische Universität München)
Niao He (ETH Zurich)
Negar Kiyavash (École Polytechnique Fédérale de Lausanne)

More from the Same Authors