In distributed or federated optimization and learning, communication between the different computing units is often the bottleneck, and gradient compression is widely used to reduce the number of bits sent within each communication round of iterative methods. There are two classes of compression operators and separate algorithms making use of them. In the case of unbiased random compressors with bounded variance (e.g., rand-k), the DIANA algorithm of Mishchenko et al. (2019), which implements a variance reduction technique for handling the variance introduced by compression, is the current state of the art. In the case of biased and contractive compressors (e.g., top-k), the EF21 algorithm of Richtárik et al. (2021), which instead implements an error-feedback mechanism, is the current state of the art. These two classes of compression schemes and algorithms are distinct, with different analyses and proof techniques. In this paper, we unify them into a single framework and propose a new algorithm, recovering DIANA and EF21 as particular cases. Our general approach works with a new, larger class of compressors, which has two parameters, the bias and the variance, and includes unbiased and biased compressors as particular cases. This allows us to inherit the best of both worlds: like EF21 and unlike DIANA, biased compressors such as top-k, whose good practical performance is well recognized, can be used. And like DIANA and unlike EF21, independent randomness at the compressors makes it possible to mitigate the effects of compression, with the convergence rate improving as the number of parallel workers grows. This is the first time that an algorithm with all these features is proposed. We prove its linear convergence under certain conditions. Our approach takes a step towards a better understanding of the two so-far distinct worlds of communication-efficient distributed learning.
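For readers unfamiliar with the two compressor classes mentioned in the abstract, the minimal Python sketch below (our own illustration, assuming NumPy; the function names and the d/k scaling convention are not taken from the paper) contrasts them: rand-k is unbiased with bounded variance, while top-k is biased but contractive.

```python
import numpy as np

def rand_k(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """Unbiased rand-k: keep k uniformly random coordinates and rescale by d/k,
    so that E[C(x)] = x; the price is extra variance when k is small."""
    d = x.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)
    return out

def top_k(x: np.ndarray, k: int) -> np.ndarray:
    """Biased, contractive top-k: keep the k largest-magnitude coordinates.
    It satisfies ||C(x) - x||^2 <= (1 - k/d) ||x||^2, but E[C(x)] != x."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

# Example: compress a gradient-like vector with both operators.
rng = np.random.default_rng(0)
g = rng.standard_normal(10)
print(rand_k(g, k=3, rng=rng))  # correct on average, but randomized and noisy
print(top_k(g, k=3))            # deterministic, biased toward large entries
```

DIANA-type methods rely on the unbiasedness of operators like rand_k, whereas EF21-type error feedback is designed for contractive operators like top_k; the paper's framework is presented as covering both regimes.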
Author Information
Laurent Condat (KAUST)
Kai Yi (KAUST)
I am a PhD student under the supervision of Prof. Peter Richtárik. Before that, I received my Master's degree from KAUST in December 2021 and my B.Eng. with honors from Xi'an Jiaotong University in June 2019. I have interned at Tencent AI Lab, CMU Xulab, the NUS CVML Group, and SenseTime.
Peter Richtarik (KAUST)
More from the Same Authors
-
2021 : Better Linear Rates for SGD with Data Shuffling »
Grigory Malinovsky · Alibek Sailanbayev · Peter Richtarik -
2021 : Shifted Compression Framework: Generalizations and Improvements »
Egor Shulgin · Peter Richtarik -
2021 : EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin · Eduard Gorbunov · Zhize Li -
2021 : On Server-Side Stepsizes in Federated Optimization: Theory Explaining the Heuristics »
Grigory Malinovsky · Konstantin Mishchenko · Peter Richtarik -
2021 : FedMix: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning »
Elnur Gasanov · Ahmed Khaled Ragab Bayoumi · Samuel Horváth · Peter Richtarik -
2022 Poster: Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques »
Bokun Wang · Mher Safaryan · Peter Richtarik -
2022 : RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates »
Laurent Condat · Peter Richtarik -
2022 : Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation »
Rustem Islamov · Xun Qian · Slavomír Hanzely · Mher Safaryan · Peter Richtarik -
2022 : Certified Robustness in Federated Learning »
Motasem Alfarra · Juan Perez · Egor Shulgin · Peter Richtarik · Bernard Ghanem -
2022 Spotlight: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Spotlight: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox »
Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik -
2022 Spotlight: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov -
2022 Spotlight: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 : Poster Session 1 »
Andrew Lowy · Thomas Bonnier · Yiling Xie · Guy Kornowski · Simon Schug · Seungyub Han · Nicolas Loizou · xinwei zhang · Laurent Condat · Tabea E. Röber · Si Yi Meng · Marco Mondelli · Runlong Zhou · Eshaan Nichani · Adrian Goldwaser · Rudrajit Das · Kayhan Behdin · Atish Agarwala · Mukul Gagrani · Gary Cheng · Tian Li · Haoran Sun · Hossein Taheri · Allen Liu · Siqi Zhang · Dmitrii Avdiukhin · Bradley Brown · Miaolan Xie · Junhyung Lyle Kim · Sharan Vaswani · Xinmeng Huang · Ganesh Ramachandra Kini · Angela Yuan · Weiqiang Zheng · Jiajin Li -
2022 Workshop: Federated Learning: Recent Advances and New Challenges »
Shiqiang Wang · Nathalie Baracaldo · Olivia Choudhury · Gauri Joshi · Peter Richtarik · Praneeth Vepakomma · Han Yu -
2022 Poster: Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning »
Grigory Malinovsky · Kai Yi · Peter Richtarik -
2022 Poster: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox »
Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2022 Poster: BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression »
Haoyu Zhao · Boyue Li · Zhize Li · Peter Richtarik · Yuejie Chi -
2022 Poster: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Poster: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Poster: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov -
2021 : Q&A with Professor Peter Richtarik »
Peter Richtarik -
2021 : Keynote Talk: Permutation Compressors for Provably Faster Distributed Nonconvex Optimization (Peter Richtarik) »
Peter Richtarik -
2021 Poster: Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization »
Mher Safaryan · Filip Hanzely · Peter Richtarik -
2021 Poster: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin -
2021 Poster: Error Compensated Distributed SGD Can Be Accelerated »
Xun Qian · Peter Richtarik · Tong Zhang -
2021 Poster: CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression »
Zhize Li · Peter Richtarik -
2021 Poster: Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks »
Dmitry Kovalev · Elnur Gasanov · Alexander Gasnikov · Peter Richtarik -
2021 Oral: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin -
2020 : Poster Session 1 (gather.town) »
Laurent Condat · Tiffany Vlaar · Ohad Shamir · Mohammadi Zaki · Zhize Li · Guan-Horng Liu · Samuel Horváth · Mher Safaryan · Yoni Choukroun · Kumar Shridhar · Nabil Kahale · Jikai Jin · Pratik Kumar Jawanpuria · Gaurav Kumar Yadav · Kazuki Koyama · Junyoung Kim · Xiao Li · Saugata Purkayastha · Adil Salim · Dighanchal Banerjee · Peter Richtarik · Lakshman Mahto · Tian Ye · Bamdev Mishra · Huikang Liu · Jiajie Zhu -
2020 : Contributed talks in Session 1 (Zoom) »
Sebastian Stich · Laurent Condat · Zhize Li · Ohad Shamir · Tiffany Vlaar · Mohammadi Zaki -
2020 : Contributed Video: Distributed Proximal Splitting Algorithms with Rates and Acceleration, Laurent Condat »
Laurent Condat -
2020 Poster: Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm »
Adil Salim · Peter Richtarik -
2020 Poster: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2020 Poster: Random Reshuffling: Simple Analysis with Vast Improvements »
Konstantin Mishchenko · Ahmed Khaled Ragab Bayoumi · Peter Richtarik -
2020 Spotlight: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2020 Session: Orals & Spotlights Track 21: Optimization »
Peter Richtarik · Marco Cuturi -
2020 Poster: Lower Bounds and Optimal Algorithms for Personalized Federated Learning »
Filip Hanzely · Slavomír Hanzely · Samuel Horváth · Peter Richtarik -
2020 Poster: Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization »
Dmitry Kovalev · Adil Salim · Peter Richtarik -
2019 Poster: RSN: Randomized Subspace Newton »
Robert Gower · Dmitry Kovalev · Felix Lieder · Peter Richtarik -
2019 Poster: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates »
Adil Salim · Dmitry Kovalev · Peter Richtarik -
2019 Spotlight: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates »
Adil Salim · Dmitry Kovalev · Peter Richtarik -
2018 Poster: Stochastic Spectral and Conjugate Descent Methods »
Dmitry Kovalev · Peter Richtarik · Eduard Gorbunov · Elnur Gasanov -
2018 Poster: Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization »
Robert Gower · Filip Hanzely · Peter Richtarik · Sebastian Stich -
2018 Poster: SEGA: Variance Reduction via Gradient Sketching »
Filip Hanzely · Konstantin Mishchenko · Peter Richtarik -
2015 Poster: Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling »
Zheng Qu · Peter Richtarik · Tong Zhang