Spotlight
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees
Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov
Variational inequalities in general and saddle point problems in particular are increasingly relevant in machine learning applications, including adversarial learning, GANs, transport and robust optimization. With the increasing data and problem sizes necessary to train high-performing models across various applications, we need to rely on parallel and distributed computing. However, in distributed training, communication among the compute nodes is a key bottleneck, and this problem is exacerbated for high-dimensional and over-parameterized models. Due to these considerations, it is important to equip existing methods with strategies that reduce the volume of information transmitted during training while obtaining a model of comparable quality. In this paper, we present the first theoretically grounded distributed methods for solving variational inequalities and saddle point problems using compressed communication: MASHA1 and MASHA2. Our theory and methods allow for the use of both unbiased (such as Rand$k$; MASHA1) and contractive (such as Top$k$; MASHA2) compressors. The new algorithms support bidirectional compression and can also be modified for the stochastic setting with batches and for federated learning with partial participation of clients. We empirically validate our conclusions using two experimental setups: a standard bilinear min-max problem, and large-scale distributed adversarial training of transformers.
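As a rough illustration of the two compressor families named in the abstract, here is a minimal NumPy sketch based on their standard textbook definitions. It is not the MASHA1 or MASHA2 algorithm itself. The function names `rand_k` and `top_k`, the explicit `rng` argument, and the restriction to 1-D float vectors are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def rand_k(x, k, rng):
    """Rand-k (unbiased): keep k uniformly random coordinates, scaled by d/k.

    The d/k scaling makes the expectation of the output equal to x.
    Assumes x is a 1-D float array and 1 <= k <= x.size.
    """
    d = x.size
    idx = rng.choice(d, size=k, replace=False)  # k distinct coordinates
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)
    return out


def top_k(x, k):
    """Top-k (contractive): keep the k largest-magnitude coordinates.

    Biased, but the squared error is at most (1 - k/d) * ||x||^2.
    Assumes x is a 1-D float array and 1 <= k <= x.size.
    """
    idx = np.argpartition(np.abs(x), -k)[-k:]  # indices of the k largest |x_i|
    out = np.zeros_like(x)
    out[idx] = x[idx]
    return out


if __name__ == "__main__":
    x = np.array([1.0, -5.0, 3.0, 0.5])
    print(top_k(x, 2))
    print(rand_k(x, 2, np.random.default_rng(0)))
```

In both cases only k values (plus their indices) would need to be transmitted instead of all d coordinates, which is the source of the communication savings.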
Author Information
Aleksandr Beznosikov (Moscow Institute of Physics and Technology)
Peter Richtarik (KAUST)
Michael Diskin (Yandex, Higher School of Economics)
Max Ryabinin (Yandex, HSE University)
Alexander Gasnikov (Moscow Institute of Physics and Technology)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
Tue. Nov 29th 05:00 -- 07:00 PM Room Hall J #416
More from the Same Authors
-
2021 : Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes »
Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov -
2021 : Better Linear Rates for SGD with Data Shuffling »
Grigory Malinovsky · Alibek Sailanbayev · Peter Richtarik -
2021 : Random-reshuffled SARAH does not need a full gradient computations »
Aleksandr Beznosikov · Martin Takac -
2021 : Shifted Compression Framework: Generalizations and Improvements »
Egor Shulgin · Peter Richtarik -
2021 : EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin · Eduard Gorbunov · Zhize Li -
2021 : On Server-Side Stepsizes in Federated Optimization: Theory Explaining the Heuristics »
Grigory Malinovsky · Konstantin Mishchenko · Peter Richtarik -
2021 : Decentralized Personalized Federated Min-Max Problems »
Ekaterina Borodich · Aleksandr Beznosikov · Abdurakhmon Sadiev · Vadim Sushko · Alexander Gasnikov -
2021 : FedMix: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning »
Elnur Gasanov · Ahmed Khaled · Samuel Horváth · Peter Richtarik -
2022 Poster: Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques »
Bokun Wang · Mher Safaryan · Peter Richtarik -
2022 : RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates »
Laurent Condat · Peter Richtarik -
2022 : Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods »
Aleksandr Beznosikov · Eduard Gorbunov · Hugo Berard · Nicolas Loizou -
2022 : Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation »
Rustem Islamov · Xun Qian · Slavomír Hanzely · Mher Safaryan · Peter Richtarik -
2022 : Effects of momentum scaling for SGD »
Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac -
2022 : Certified Robustness in Federated Learning »
Motasem Alfarra · Juan Perez · Egor Shulgin · Peter Richtarik · Bernard Ghanem -
2023 : Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization »
Hanmin Li · Avetik Karagulyan · Peter Richtarik -
2023 : Towards a Better Theoretical Understanding of Independent Subnetwork Training »
Egor Shulgin · Peter Richtarik -
2023 : Improved Stein Variational Gradient Descent with Importance Weights »
Lukang Sun · Peter Richtarik -
2023 : MARINA Meets Matrix Stepsizes: Variance Reduced Distributed Non-Convex Optimization »
Hanmin Li · Avetik Karagulyan · Peter Richtarik -
2023 : TAMUNA: Doubly Accelerated Federated Learning with Local Training, Compression, and Partial Participation »
Laurent Condat · Ivan Agarský · Grigory Malinovsky · Peter Richtarik -
2023 Poster: 2Direction: Theoretically Faster Distributed Training with Bidirectional Communication Compression »
Alexander Tyurin · Peter Richtarik -
2023 Poster: A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting »
Alexander Tyurin · Peter Richtarik -
2023 Poster: Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics »
Anton Voronov · Mikhail Khoroshikh · Artem Babenko · Max Ryabinin -
2023 Poster: Distributed Inference and Fine-tuning of Large Language Models Over The Internet »
Alexander Borzunov · Max Ryabinin · Artem Chumachenko · Dmitry Baranchuk · Tim Dettmers · Younes Belkada · Pavel Samygin · Colin Raffel -
2023 Poster: First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities »
Aleksandr Beznosikov · Sergey Samsonov · Marina Sheshukova · Alexander Gasnikov · Alexey Naumov · Eric Moulines -
2023 Poster: Optimal Time Complexities of Parallel Stochastic Optimization Methods Under a Fixed Computation Model »
Alexander Tyurin · Peter Richtarik -
2023 Poster: Momentum Provably Improves Error Feedback! »
Ilyas Fatkhullin · Alexander Tyurin · Peter Richtarik -
2023 Poster: Accelerated Zeroth-order Method for Non-Smooth Stochastic Convex Optimization Problem with Infinite Variance »
Nikita Kornilov · Ohad Shamir · Aleksandr Lobanov · Darina Dvinskikh · Alexander Gasnikov · Innokentiy Shibaev · Eduard Gorbunov · Samuel Horváth -
2023 Poster: Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities »
Aleksandr Beznosikov · Martin Takac · Alexander Gasnikov -
2023 Poster: A Guide Through the Zoo of Biased SGD »
Yury Demidovich · Grigory Malinovsky · Igor Sokolov · Peter Richtarik -
2022 Spotlight: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Spotlight: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox »
Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik -
2022 Spotlight: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Spotlight: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari -
2022 Spotlight: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Spotlight: Decentralized Local Stochastic Extra-Gradient for Variational Inequalities »
Aleksandr Beznosikov · Pavel Dvurechenskii · Anastasiia Koloskova · Valentin Samokhin · Sebastian Stich · Alexander Gasnikov -
2022 : Petals: Collaborative Inference and Fine-tuning of Large Models »
Alexander Borzunov · Dmitry Baranchuk · Tim Dettmers · Max Ryabinin · Younes Belkada · Artem Chumachenko · Pavel Samygin · Colin Raffel -
2022 Workshop: Federated Learning: Recent Advances and New Challenges »
Shiqiang Wang · Nathalie Baracaldo · Olivia Choudhury · Gauri Joshi · Peter Richtarik · Praneeth Vepakomma · Han Yu -
2022 Poster: Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning »
Grigory Malinovsky · Kai Yi · Peter Richtarik -
2022 Poster: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari -
2022 Poster: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox »
Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik -
2022 Poster: Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise »
Eduard Gorbunov · Marina Danilova · David Dobre · Pavel Dvurechenskii · Alexander Gasnikov · Gauthier Gidel -
2022 Poster: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2022 Poster: BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression »
Haoyu Zhao · Boyue Li · Zhize Li · Peter Richtarik · Yuejie Chi -
2022 Poster: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
Dmitry Kovalev · Alexander Gasnikov -
2022 Poster: EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization »
Laurent Condat · Kai Yi · Peter Richtarik -
2022 Poster: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov -
2022 Poster: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik -
2022 Poster: Decentralized Local Stochastic Extra-Gradient for Variational Inequalities »
Aleksandr Beznosikov · Pavel Dvurechenskii · Anastasiia Koloskova · Valentin Samokhin · Sebastian Stich · Alexander Gasnikov -
2021 : Poster Session 2 (gather.town) »
Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov -
2021 : Q&A with Professor Peter Richtarik »
Peter Richtarik -
2021 : Keynote Talk: Permutation Compressors for Provably Faster Distributed Nonconvex Optimization (Peter Richtarik) »
Peter Richtarik -
2021 Poster: Distributed Deep Learning In Open Collaborations »
Michael Diskin · Alexey Bukhtiyarov · Max Ryabinin · Lucile Saulnier · quentin lhoest · Anton Sinitsin · Dmitry Popov · Dmitry V. Pyrkin · Maxim Kashirin · Alexander Borzunov · Albert Villanova del Moral · Denis Mazur · Ilia Kobelev · Yacine Jernite · Thomas Wolf · Gennady Pekhimenko -
2021 Poster: Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices »
Max Ryabinin · Eduard Gorbunov · Vsevolod Plokhotnyuk · Gennady Pekhimenko -
2021 Poster: Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization »
Mher Safaryan · Filip Hanzely · Peter Richtarik -
2021 Poster: Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets »
Max Ryabinin · Andrey Malinin · Mark Gales -
2021 Poster: Distributed Saddle-Point Problems Under Data Similarity »
Aleksandr Beznosikov · Gesualdo Scutari · Alexander Rogozin · Alexander Gasnikov -
2021 Poster: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin -
2021 Poster: Error Compensated Distributed SGD Can Be Accelerated »
Xun Qian · Peter Richtarik · Tong Zhang -
2021 Poster: CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression »
Zhize Li · Peter Richtarik -
2021 Poster: Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks »
Dmitry Kovalev · Elnur Gasanov · Alexander Gasnikov · Peter Richtarik -
2021 : Training Transformers Together »
Alexander Borzunov · Max Ryabinin · Tim Dettmers · quentin lhoest · Lucile Saulnier · Michael Diskin · Yacine Jernite · Thomas Wolf -
2021 Oral: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback »
Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin -
2020 : Poster Session 1 (gather.town) »
Laurent Condat · Tiffany Vlaar · Ohad Shamir · Mohammadi Zaki · Zhize Li · Guan-Horng Liu · Samuel Horváth · Mher Safaryan · Yoni Choukroun · Kumar Shridhar · Nabil Kahale · Jikai Jin · Pratik Kumar Jawanpuria · Gaurav Kumar Yadav · Kazuki Koyama · Junyoung Kim · Xiao Li · Saugata Purkayastha · Adil Salim · Dighanchal Banerjee · Peter Richtarik · Lakshman Mahto · Tian Ye · Bamdev Mishra · Huikang Liu · Jiajie Zhu -
2020 Poster: Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts »
Max Ryabinin · Anton Gusev -
2020 Poster: Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm »
Adil Salim · Peter Richtarik -
2020 Poster: Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping »
Eduard Gorbunov · Marina Danilova · Alexander Gasnikov -
2020 Poster: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2020 Poster: Random Reshuffling: Simple Analysis with Vast Improvements »
Konstantin Mishchenko · Ahmed Khaled · Peter Richtarik -
2020 Spotlight: Linearly Converging Error Compensated SGD »
Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik -
2020 Session: Orals & Spotlights Track 21: Optimization »
Peter Richtarik · Marco Cuturi -
2020 Poster: Lower Bounds and Optimal Algorithms for Personalized Federated Learning »
Filip Hanzely · Slavomír Hanzely · Samuel Horváth · Peter Richtarik -
2020 Poster: Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization »
Dmitry Kovalev · Adil Salim · Peter Richtarik -
2019 Poster: RSN: Randomized Subspace Newton »
Robert Gower · Dmitry Kovalev · Felix Lieder · Peter Richtarik -
2019 Poster: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates »
Adil Salim · Dmitry Kovalev · Peter Richtarik -
2019 Spotlight: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates »
Adil Salim · Dmitry Kovalev · Peter Richtarik -
2018 Poster: Stochastic Spectral and Conjugate Descent Methods »
Dmitry Kovalev · Peter Richtarik · Eduard Gorbunov · Elnur Gasanov -
2018 Poster: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich -
2018 Spotlight: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich -
2018 Poster: Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization »
Robert Gower · Filip Hanzely · Peter Richtarik · Sebastian Stich -
2018 Poster: SEGA: Variance Reduction via Gradient Sketching »
Filip Hanzely · Konstantin Mishchenko · Peter Richtarik -
2016 Poster: Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods »
Lev Bogolubsky · Pavel Dvurechenskii · Alexander Gasnikov · Gleb Gusev · Yurii Nesterov · Andrei M Raigorodskii · Aleksey Tikhonov · Maksim Zhukovskii -
2015 Poster: Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling »
Zheng Qu · Peter Richtarik · Tong Zhang