High Probability Step Size Lower Bound for Adaptive Stochastic Optimization
Katya Scheinberg · Miaolan Xie

Several classical adaptive optimization algorithms, such as line search and trust-region methods, have recently been extended to stochastic settings. Unlike the stochastic gradient method and its many variants, these algorithms do not use a pre-specified sequence of step sizes, but increase or decrease the step size adaptively according to the estimated progress of the algorithm. These algorithms rely on stochastic oracles that estimate function values, gradients, and, in some cases, Hessians. The accuracy requirement of these oracles is also adaptive and depends on the step size. In the deterministic setting, a lower bound on the step size is easily derived. In the stochastic setting, however, bounds on the step size have not previously been derived because of possible oracle failures. In this paper, we give a lower bound on the step size that holds with high probability. This bound depends on the probability of oracle failure and recovers the deterministic result as an extreme case when this probability is zero.
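
The adaptive step size rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or its oracle definitions: the function names, the one-dimensional quadratic objective, the sufficient-decrease test, and the parameters `gamma`, `theta`, and `p_fail` are all assumptions chosen for illustration, and the sketch does not model the step-size-dependent accuracy requirement. The step size grows after an estimated successful step and shrinks otherwise, and the hypothetical oracles return a corrupted estimate with probability `p_fail`.

```python
import random


def make_oracles(p_fail, rng):
    """Hypothetical oracles for f(x) = 0.5 * x**2 (illustration only).

    With probability p_fail an oracle "fails" and returns a corrupted
    estimate; otherwise it returns the exact value.
    """
    def f_est(x):
        exact = 0.5 * x * x
        return exact + rng.gauss(0.0, 1.0) if rng.random() < p_fail else exact

    def g_est(x):
        exact = x
        return exact + rng.gauss(0.0, 1.0) if rng.random() < p_fail else exact

    return f_est, g_est


def adaptive_step_search(f_est, g_est, x0, alpha0=1.0, gamma=2.0,
                         theta=0.1, alpha_max=10.0, iters=200):
    """Gradient steps with a step size adapted from estimated progress."""
    x, alpha = x0, alpha0
    history = []  # step size after each iteration
    for _ in range(iters):
        g = g_est(x)
        x_trial = x - alpha * g
        # Sufficient-decrease test on the *estimated* function values.
        if f_est(x_trial) <= f_est(x) - theta * alpha * g * g:
            x = x_trial                            # accept the step
            alpha = min(gamma * alpha, alpha_max)  # and increase the step size
        else:
            alpha = alpha / gamma                  # reject: decrease the step size
        history.append(alpha)
    return x, history
```

With `p_fail = 0` the oracles are exact, and for this quadratic every step with `alpha <= 2 * (1 - theta)` passes the test, so the step size never falls below `2 * (1 - theta) / gamma`. With `p_fail > 0` a failed oracle call can reject a good step, which is why a step size lower bound in the stochastic setting can hold only with high probability.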