Timezone: »
Poster
Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations
Kevin Scaman · Cedric Malherbe
This work proposes a novel analysis of stochastic gradient descent (SGD) for non-convex and smooth optimization. Our analysis sheds light on the impact of the probability distribution of the gradient noise on the convergence rate of the norm of the gradient. In the case of sub-Gaussian and centered noise, we prove that, with probability $1-\delta$, the number of iterations to reach a precision $\varepsilon$ for the squared gradient norm is $O(\varepsilon^{-2}\ln(1/\delta))$. In the case of centered and integrable heavy-tailed noise, we show that, while the expectation of the iterates may be infinite, the squared gradient norm still converges with probability $1-\delta$ in $O(\varepsilon^{-p}\delta^{-q})$ iterations, where $p,q > 2$. This result shows that heavy-tailed noise on the gradient slows down the convergence of SGD without preventing it, proving that SGD is robust to gradient noise with unbounded variance, a setting of interest for Deep Learning. In addition, it indicates that choosing a step size proportional to $T^{-1/b}$ where $b$ is the tail-parameter of the noise and $T$ is the number of iterations leads to the best convergence rates. Both results are simple corollaries of a unified analysis using the novel concept of biased expectations, a simple and intuitive mathematical tool to obtain concentration inequalities. Using this concept, we propose a new quantity to measure the amount of noise added to the gradient, and discuss its value in multiple scenarios.
Author Information
Kevin Scaman (Huawei Noah's Ark Lab)
Cedric Malherbe (Huawei Noah's Ark Lab)
More from the Same Authors
-
2022 : Meta-learning of Black-box Solvers Using Deep Reinforcement Learning »
Cedric Malherbe · Aladin Virmaux · Ludovic Dos Santos · Sofian Chaybouti -
2022 Spotlight: Optimistic Tree Searches for Combinatorial Black-Box Optimization »
Cedric Malherbe · Antoine Grosnit · Rasul Tutunov · Haitham Bou Ammar · Jun Wang -
2022 Poster: Optimistic Tree Searches for Combinatorial Black-Box Optimization »
Cedric Malherbe · Antoine Grosnit · Rasul Tutunov · Haitham Bou Ammar · Jun Wang -
2020 Poster: A Simple and Efficient Smoothing Method for Faster Optimization and Local Exploration »
Kevin Scaman · Ludovic DOS SANTOS · Merwan Barlier · Igor Colin -
2018 Poster: KONG: Kernels for ordered-neighborhood graphs »
Moez Draief · Konstantin Kutzkov · Kevin Scaman · Milan Vojnovic -
2018 Spotlight: KONG: Kernels for ordered-neighborhood graphs »
Moez Draief · Konstantin Kutzkov · Kevin Scaman · Milan Vojnovic