Skip to yearly menu bar Skip to main content

Workshop: Heavy Tails in ML: Structure, Stability, Dynamics

High Probability Guarantees for Random Reshuffling

Hengxu Yu · Xiao Li

Keywords: [ the last iteration result ] [ Random Reshuffling ] [ shuffled SGD ] [ stopping criterion ] [ high-probability sample complexity ]


We study the probabilistic behaviors of stochastic gradient descent with random reshuffling (RR) on nonconvex problems. We prove that the same complexity (except for a logrithmic term) as that of in expectation case also holds with high probability, which characterizes the performance of RR for a single run instead of averaging infinitely many realizations. Our analysis does not impose any additional assumptions on the stochastic gradient errors, which admits heavy tails. This is in contrast to high probabiltiy analyses of SGD that rely on sub-Gaussian stochastic gradient errors or tricks like clipping, momentum, etc. Furthermore, leveraging the established high probability error bounds, we propose a simple stopping criterion for RR that introduces few computational costs. We prove that the function value strictly decreases with a high probability before the stopping criterion is triggered, ensuring that the criterion will indeed be activated. Finally, a "last iterate'' result is built for the iteration returned with this stopping criterion. We believe that our new developments for RR serve as a stepping stone towards enabling more refined high probability analyses for characterizing its performance.

Chat is not available.