NeurIPS Poster Complexity of Highly Parallel Non-Smooth Convex Optimization

Poster

Complexity of Highly Parallel Non-Smooth Convex Optimization

Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford

East Exhibition Hall B, C #107

Keywords: [ Convex Optimization ] [ Optimization ]

[ Abstract ]

Abstract: A landmark result of non-smooth convex optimization is that gradient descent is an optimal algorithm whenever the number of computed gradients is smaller than the dimension

d

$d$ . In this paper we study the extension of this result to the parallel optimization setting. Namely we consider optimization algorithms interacting with a highly parallel gradient oracle, that is one that can answer

p o l y (d)

$\mathrm{poly}(d)$ gradient queries in parallel. We show that in this case gradient descent is optimal only up to

\tilde{O} (\sqrt{d})

$\tilde{O}(\sqrt{d})$ rounds of interactions with the oracle. The lower bound improves upon a decades old construction by Nemirovski which proves optimality only up to

d^{1 / 3}

$d^{1/3}$ rounds (as recently observed by Balkanski and Singer), and the suboptimality of gradient descent after

\sqrt{d}

$\sqrt{d}$ rounds was already observed by Duchi, Bartlett and Wainwright. In the latter regime we propose a new method with improved complexity, which we conjecture to be optimal. The analysis of this new method is based upon a generalized version of the recent results on optimal acceleration for highly smooth convex optimization.

Live content is unavailable. Log in and register to view live content