Poster
Complexity of Highly Parallel Non-Smooth Convex Optimization
Sebastien Bubeck · Qijia Jiang · Yin-Tat Lee · Yuanzhi Li · Aaron Sidford
Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C #107
A landmark result of non-smooth convex optimization is that gradient descent is an optimal algorithm whenever the number of computed gradients is smaller than the dimension $d$. In this paper we study the extension of this result to the parallel optimization setting. Namely, we consider optimization algorithms interacting with a highly parallel gradient oracle, that is, one that can answer $\mathrm{poly}(d)$ gradient queries in parallel. We show that in this case gradient descent is optimal only up to $\tilde{O}(\sqrt{d})$ rounds of interactions with the oracle. The lower bound improves upon a decades-old construction by Nemirovski, which proves optimality only up to $d^{1/3}$ rounds (as recently observed by Balkanski and Singer), and the suboptimality of gradient descent after $\sqrt{d}$ rounds was already observed by Duchi, Bartlett and Wainwright. In the latter regime we propose a new method with improved complexity, which we conjecture to be optimal. The analysis of this new method is based upon a generalized version of the recent results on optimal acceleration for highly smooth convex optimization.
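To make the interaction model concrete, here is a minimal sketch (not the paper's hard instance or proposed method) of the round-based oracle model the abstract describes: in each round the algorithm may submit a batch of up to $\mathrm{poly}(d)$ query points at once, whereas plain subgradient descent submits only one, so its round complexity coincides with its query complexity. The test function, step size, and round budget below are illustrative placeholders.

```python
import numpy as np

def f(x):
    # Non-smooth convex test function f(x) = max_i x_i; an illustrative
    # placeholder, not the lower-bound construction from the paper.
    return np.max(x)

def subgradient(x):
    # A valid subgradient of max_i x_i: the indicator of an argmax coordinate.
    g = np.zeros_like(x)
    g[np.argmax(x)] = 1.0
    return g

def parallel_oracle(points):
    # One round of interaction: the oracle answers a whole batch of
    # gradient queries (up to poly(d) of them) simultaneously.
    return [subgradient(p) for p in points]

def subgradient_descent(x0, rounds, step):
    # Plain (sub)gradient descent issues a single query per round, so it
    # exploits no parallelism: its round and query complexities coincide.
    x = x0.copy()
    for _ in range(rounds):
        (g,) = parallel_oracle([x])  # one query per interaction round
        x = x - step * g
    return x

d = 100
x = subgradient_descent(np.ones(d), rounds=int(np.sqrt(d)), step=1.0 / np.sqrt(d))
print("f after ~sqrt(d) rounds:", f(x))
```

The paper's question is what a cleverer algorithm can gain by filling each round's batch with many query points; its result is that below $\tilde{O}(\sqrt{d})$ rounds, nothing can beat the one-query-per-round baseline above.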
Author Information
Sebastien Bubeck (Microsoft Research)
Qijia Jiang (Stanford University)
Yin-Tat Lee
Yuanzhi Li (Princeton)
Aaron Sidford (Stanford)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Spotlight: Complexity of Highly Parallel Non-Smooth Convex Optimization »
  Wed Dec 11 12:45 -- 12:50 AM @ West Exhibition Hall A
More from the Same Authors
- 2020 Poster: Instance Based Approximations to Profile Maximum Likelihood »
  Nima Anari · Moses Charikar · Kirankumar Shiragur · Aaron Sidford
- 2020 Poster: Acceleration with a Ball Optimization Oracle »
  Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian
- 2020 Poster: Large-Scale Methods for Distributionally Robust Optimization »
  Daniel Levy · Yair Carmon · John Duchi · Aaron Sidford
- 2020 Oral: Acceleration with a Ball Optimization Oracle »
  Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian
- 2020 Poster: Network size and size of the weights in memorization with two-layers neural networks »
  Sebastien Bubeck · Ronen Eldan · Yin Tat Lee · Dan Mikulincer
- 2019 Poster: A General Framework for Symmetric Property Estimation »
  Moses Charikar · Kirankumar Shiragur · Aaron Sidford
- 2019 Poster: Variance Reduction for Matrix Games »
  Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian
- 2019 Poster: On the Convergence Rate of Training Recurrent Neural Networks »
  Zeyuan Allen-Zhu · Yuanzhi Li · Zhao Song
- 2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
  Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang
- 2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
  Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang
- 2019 Oral: Variance Reduction for Matrix Games »
  Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian
- 2019 Poster: What Can ResNet Learn Efficiently, Going Beyond Kernels? »
  Zeyuan Allen-Zhu · Yuanzhi Li
- 2019 Poster: Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG »
  Yujia Jin · Aaron Sidford
- 2019 Spotlight: Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG »
  Yujia Jin · Aaron Sidford
- 2019 Poster: Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers »
  Zeyuan Allen-Zhu · Yuanzhi Li · Yingyu Liang
- 2019 Poster: Can SGD Learn Recurrent Neural Networks with Provable Generalization? »
  Zeyuan Allen-Zhu · Yuanzhi Li
- 2019 Poster: A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport »
  Arun Jambulapati · Aaron Sidford · Kevin Tian
- 2019 Poster: Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks »
  Yuanzhi Li · Colin Wei · Tengyu Ma
- 2019 Spotlight: Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks »
  Yuanzhi Li · Colin Wei · Tengyu Ma
- 2018 Poster: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks »
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Poster: Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression »
  Neha Gupta · Aaron Sidford
- 2018 Poster: Online Improper Learning with an Approximation Oracle »
  Elad Hazan · Wei Hu · Yuanzhi Li · Zhiyuan Li
- 2018 Oral: Optimal Algorithms for Non-Smooth Distributed Optimization in Networks »
  Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
- 2018 Poster: NEON2: Finding Local Minima via First-Order Oracles »
  Zeyuan Allen-Zhu · Yuanzhi Li
- 2018 Poster: Is Q-Learning Provably Efficient? »
  Chi Jin · Zeyuan Allen-Zhu · Sebastien Bubeck · Michael Jordan
- 2018 Poster: Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model »
  Aaron Sidford · Mengdi Wang · Xian Wu · Lin Yang · Yinyu Ye
- 2018 Poster: Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data »
  Yuanzhi Li · Yingyu Liang
- 2018 Spotlight: Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data »
  Yuanzhi Li · Yingyu Liang