Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial, but the most popular algorithm, Stochastic Gradient Descent (SGD), is a serial method that is surprisingly hard to parallelize. In this paper, we propose an efficient distributed stochastic optimization method by combining adaptivity with variance reduction techniques. Our analysis yields a linear speedup in the number of machines, constant memory footprint, and only a logarithmic number of communication rounds. Critically, our approach is a black-box reduction that parallelizes any serial online learning algorithm, streamlining prior analysis and allowing us to leverage the significant progress that has been made in designing adaptive algorithms. In particular, we achieve optimal convergence rates without any prior knowledge of smoothness parameters, yielding a more robust algorithm that reduces the need for hyperparameter tuning. We implement our algorithm in the Spark distributed framework and exhibit dramatic performance gains on large-scale logistic regression problems.
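To make the distributed setting in the abstract concrete, below is a minimal sketch in which several workers each compute a stochastic gradient of the logistic loss on their own data shard and a driver averages those gradients into a single update (simulated serially with NumPy rather than in Spark). This is only an illustration of plain parallel minibatch SGD, not the paper's algorithm: it omits the adaptive step sizes, variance reduction, and black-box reduction that are the actual contributions, and all names here (parallel_sgd, shards, step_size, and so on) are illustrative assumptions.

```python
# Sketch only: generic parallel minibatch SGD for logistic regression with
# gradient averaging across worker shards. NOT the paper's adaptive,
# variance-reduced method; names and constants are illustrative assumptions.
import numpy as np

def logistic_grad(w, X, y):
    """Gradient of the logistic loss (1/n) * sum log(1 + exp(-y_i * x_i.w)) at w."""
    margins = y * (X @ w)
    coeffs = -y / (1.0 + np.exp(margins))        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m))
    return X.T @ coeffs / len(y)

def parallel_sgd(shards, dim, step_size=0.5, rounds=100, batch=32, seed=0):
    """One round: every worker draws a minibatch from its shard, the driver
    averages the resulting gradients and takes a single SGD step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    for _ in range(rounds):
        grads = []
        for X, y in shards:                      # in Spark this loop would be a map over partitions
            idx = rng.choice(len(y), size=min(batch, len(y)), replace=False)
            grads.append(logistic_grad(w, X[idx], y[idx]))
        w -= step_size * np.mean(grads, axis=0)  # reduce: average gradients, update on the driver
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_true = rng.normal(size=10)
    X = rng.normal(size=(4000, 10))
    y = np.sign(X @ w_true + 0.1 * rng.normal(size=4000))
    shards = [(X[i::4], y[i::4]) for i in range(4)]   # pretend there are 4 machines
    w_hat = parallel_sgd(shards, dim=10)
    print(f"training accuracy: {np.mean(np.sign(X @ w_hat) == y):.3f}")
```

Averaging gradients across workers reduces the variance of each update roughly by the number of machines, which is the basic reason distributed minibatching can yield a linear speedup; the paper's contribution is achieving this without prior knowledge of smoothness parameters and with only logarithmically many communication rounds.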
Author Information
Ashok Cutkosky (Google)
Róbert Busa-Fekete (Yahoo! Research)
More from the Same Authors
- 2021 Spotlight: Online Selective Classification with Limited Feedback
  Aditya Gangrade · Anil Kag · Ashok Cutkosky · Venkatesh Saligrama
- 2021: Population Level Privacy Leakage in Binary Classification with Label Noise
  Róbert Busa-Fekete · Andres Munoz · Umar Syed · Sergei Vassilvitskii
- 2021: On the Pitfalls of Label Differential Privacy
  Andres Munoz · Róbert Busa-Fekete · Umar Syed · Sergei Vassilvitskii
- 2021 Oral: High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
  Ashok Cutkosky · Harsh Mehta
- 2021 Poster: High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
  Ashok Cutkosky · Harsh Mehta
- 2021 Poster: Online Selective Classification with Limited Feedback
  Aditya Gangrade · Anil Kag · Ashok Cutkosky · Venkatesh Saligrama
- 2021 Poster: Identity testing for Mallows model
  Róbert Busa-Fekete · Dimitris Fotakis · Balazs Szorenyi · Emmanouil Zampetakis
- 2021 Poster: Logarithmic Regret from Sublinear Hints
  Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit
- 2021 Poster: Private and Non-private Uniformity Testing for Ranking Data
  Róbert Busa-Fekete · Dimitris Fotakis · Emmanouil Zampetakis
- 2018 Poster: A no-regret generalization of hierarchical softmax to extreme multi-label classification
  Marek Wydmuch · Kalina Jasinska-Kobus · Mikhail Kuznetsov · Róbert Busa-Fekete · Krzysztof Dembczynski
- 2017 Poster: Stochastic and Adversarial Online Learning without Hyperparameters
  Ashok Cutkosky · Kwabena A Boahen
- 2016 Poster: Online Convex Optimization with Unconstrained Domains and Losses
  Ashok Cutkosky · Kwabena A Boahen
- 2015 Poster: Online F-Measure Optimization
  Róbert Busa-Fekete · Balázs Szörényi · Krzysztof Dembczynski · Eyke Hüllermeier
- 2015 Poster: Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach
  Balázs Szörényi · Róbert Busa-Fekete · Adil Paul · Eyke Hüllermeier