Timezone: »
We analyze new online gradient descent algorithms for distributed systems with large delays between gradient computations and the corresponding updates. Using insights from adaptive gradient methods, we develop algorithms that adapt not only to the sequence of gradients, but also to the precise update delays that occur. We first give an impractical algorithm that achieves a regret bound that precisely quantifies the impact of the delays. We then analyze AdaptiveRevision, an algorithm that is efficiently implementable and achieves comparable guarantees. The key algorithmic technique is appropriately and efficiently revising the learning rate used for previous gradient steps. Experimental results show when the delays grow large (1000 updates or more), our new algorithms perform significantly better than standard adaptive gradient methods.
Author Information
Brendan McMahan (Google Research)
Matthew Streeter (Duolingo)
More from the Same Authors
-
2021 Poster: Differentially Private Learning with Adaptive Clipping »
Galen Andrew · Om Thakkar · Brendan McMahan · Swaroop Ramaswamy -
2020 Tutorial: (Track1) Federated Learning and Analytics: Industry Meets Academia Q&A »
Peter Kairouz · Brendan McMahan · Virginia Smith -
2020 Poster: Privacy Amplification via Random Check-Ins »
Borja Balle · Peter Kairouz · Brendan McMahan · Om Thakkar · Abhradeep Guha Thakurta -
2020 Tutorial: (Track1) Federated Learning and Analytics: Industry Meets Academia »
Brendan McMahan · Virginia Smith · Peter Kairouz -
2019 : Privacy for Federated Learning, and Federated Learning for Privacy »
Brendan McMahan -
2019 Workshop: Workshop on Federated Learning for Data Privacy and Confidentiality »
Lixin Fan · Jakub Konečný · Yang Liu · Brendan McMahan · Virginia Smith · Han Yu -
2018 : Brendan McMahan »
Brendan McMahan -
2018 Poster: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Spotlight: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization »
Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro -
2018 Poster: cpSGD: Communication-efficient and differentially-private distributed SGD »
Naman Agarwal · Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2018 Spotlight: cpSGD: Communication-efficient and differentially-private distributed SGD »
Naman Agarwal · Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2013 Poster: Minimax Optimal Algorithms for Unconstrained Linear Optimization »
Brendan McMahan · Jacob D Abernethy -
2013 Poster: Estimation, Optimization, and Parallelism when Data is Sparse »
John Duchi · Michael Jordan · Brendan McMahan -
2012 Poster: No-Regret Algorithms for Unconstrained Online Convex Optimization »
Matthew Streeter · Brendan McMahan