We consider distributed stochastic variational inequalities (VIs) on unbounded domains with heterogeneous (non-IID) problem data distributed across many devices. We make a very general assumption on the computational network that, in particular, covers the settings of fully decentralized computation over time-varying networks and of centralized topologies commonly used in Federated Learning. Moreover, the workers can perform multiple local updates to reduce the frequency of communication between them. We extend the stochastic extragradient method to this very general setting and theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone (when a Minty solution exists) settings. The provided rates explicitly exhibit the dependence on network characteristics (e.g., mixing time), iteration counter, data heterogeneity, variance, number of devices, and other standard parameters. As a special case, our method and analysis apply to distributed stochastic saddle-point problems (SPPs), e.g., to the training of Deep Generative Adversarial Networks (GANs), for which decentralized training has been reported to be extremely challenging. In experiments on the decentralized training of GANs we demonstrate the effectiveness of our proposed approach.
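The structure described in the abstract (workers running several local extragradient steps on heterogeneous operators, then averaging iterates with neighbors through a mixing matrix) can be illustrated with a minimal toy sketch. This is not the paper's algorithm or constants: the ring mixing matrix, the linear strongly monotone local operators F_m(z) = A_m z + b_m, the step size, and the noise level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, M = 4, 5  # problem dimension, number of workers

# Heterogeneous local operators F_m(z) = A_m z + b_m; the symmetric
# positive-definite A_m makes each operator strongly monotone (assumption).
A = []
for _ in range(M):
    G = 0.1 * rng.standard_normal((d, d))
    A.append(0.5 * (G + G.T) + d * np.eye(d))
b = [rng.standard_normal(d) for _ in range(M)]

# Doubly stochastic gossip (mixing) matrix for a ring topology (assumption).
W = np.zeros((M, M))
for m in range(M):
    W[m, m] = 0.5
    W[m, (m - 1) % M] = 0.25
    W[m, (m + 1) % M] = 0.25

def F(m, z):
    """Stochastic oracle for worker m's operator (small additive noise)."""
    return A[m] @ z + b[m] + 0.01 * rng.standard_normal(d)

Z = np.zeros((M, d))          # one iterate per worker
gamma, local_steps = 0.05, 5  # step size and local updates per round
for _ in range(200):          # communication rounds
    for _ in range(local_steps):
        for m in range(M):
            z_half = Z[m] - gamma * F(m, Z[m])  # extrapolation step
            Z[m] = Z[m] - gamma * F(m, z_half)  # update step
    Z = W @ Z                 # gossip averaging with neighbors

# Solution of the averaged VI: (1/M) sum_m F_m(z*) = 0.
z_star = np.linalg.solve(sum(A), -sum(b))
err = np.linalg.norm(Z.mean(axis=0) - z_star)
print(f"distance of averaged iterate to solution: {err:.4f}")
```

On this near-identity toy problem the averaged iterate approaches the solution of the mean operator; with stronger heterogeneity or larger step sizes, the local updates introduce a drift term, which is exactly the kind of dependence the paper's rates make explicit.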
Author Information
Aleksandr Beznosikov (Moscow Institute of Physics and Technology)
Pavel Dvurechenskii (Weierstrass Institute, Berlin)
Since 2015: Research assistant, Research Group 6 "Stochastic Algorithms and Nonparametric Statistics", Weierstrass Institute for Applied Analysis and Stochastics, Berlin
2014–2015: Research assistant, Institute for Information Transmission Problems, Moscow, Russia
2009–2015: Junior researcher, Moscow Institute of Physics and Technology, Moscow, Russia
2013: Ph.D., Moscow Institute of Physics and Technology, Moscow, Russia
2010: Master's Diploma, Moscow Institute of Physics and Technology, Moscow, Russia
2008: Bachelor's Diploma, Moscow Institute of Physics and Technology, Moscow, Russia
Anastasiia Koloskova (EPFL)
Valentin Samokhin (Skolkovo Institute of Science and Technology)
Sebastian Stich (CISPA)
Alexander Gasnikov (Moscow Institute of Physics and Technology)
More from the Same Authors
- 2021 : Decentralized Personalized Federated Learning: Lower Bounds and Optimal Algorithm for All Personalization Modes »
  Abdurakhmon Sadiev · Ekaterina Borodich · Darina Dvinskikh · Aleksandr Beznosikov · Alexander Gasnikov
- 2021 : Random-reshuffled SARAH does not need a full gradient computations »
  Aleksandr Beznosikov · Martin Takac
- 2021 : Escaping Local Minima With Stochastic Noise »
  Harshvardhan Harshvardhan · Sebastian Stich
- 2021 : Decentralized Personalized Federated Min-Max Problems »
  Ekaterina Borodich · Aleksandr Beznosikov · Abdurakhmon Sadiev · Vadim Sushko · Alexander Gasnikov
- 2021 : The Peril of Popular Deep Learning Uncertainty Estimation Methods »
  Yehao Liu · Matteo Pagliardini · Tatjana Chavdarova · Sebastian Stich
- 2022 : Data-heterogeneity-aware Mixing for Decentralized Learning »
  Yatin Dandi · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich
- 2022 : Bidirectional Adaptive Communication for Heterogeneous Distributed Learning »
  Dmitrii Avdiukhin · Vladimir Braverman · Nikita Ivkin · Sebastian Stich
- 2022 : Decentralized Stochastic Optimization with Client Sampling »
  Ziwei Liu · Anastasiia Koloskova · Martin Jaggi · Tao Lin
- 2022 : Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods »
  Aleksandr Beznosikov · Eduard Gorbunov · Hugo Berard · Nicolas Loizou
- 2022 : Effects of momentum scaling for SGD »
  Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac
- 2022 : Preserving privacy with PATE for heterogeneous data »
  Akshay Dodwadmath · Sebastian Stich
- 2022 Spotlight: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
  Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik
- 2022 Panel: Panel 4B-2: Decentralized Training of… & Sharper Convergence Guarantees… »
  Anastasiia Koloskova · Tianyi Zhang
- 2022 Spotlight: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
  Dmitry Kovalev · Alexander Gasnikov
- 2022 Spotlight: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
  Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov
- 2022 Spotlight: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
  Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov
- 2022 Spotlight: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
  Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari
- 2022 Spotlight: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
  Dmitry Kovalev · Alexander Gasnikov
- 2022 Spotlight: Decentralized Local Stochastic Extra-Gradient for Variational Inequalities »
  Aleksandr Beznosikov · Pavel Dvurechenskii · Anastasiia Koloskova · Valentin Samokhin · Sebastian Stich · Alexander Gasnikov
- 2022 Workshop: OPT 2022: Optimization for Machine Learning »
  Courtney Paquette · Sebastian Stich · Quanquan Gu · Cristóbal Guzmán · John Duchi
- 2022 Poster: Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning »
  Anastasiia Koloskova · Sebastian Stich · Martin Jaggi
- 2022 Poster: Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity »
  Dmitry Kovalev · Aleksandr Beznosikov · Ekaterina Borodich · Alexander Gasnikov · Gesualdo Scutari
- 2022 Poster: Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise »
  Eduard Gorbunov · Marina Danilova · David Dobre · Pavel Dvurechenskii · Alexander Gasnikov · Gauthier Gidel
- 2022 Poster: The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization »
  Dmitry Kovalev · Alexander Gasnikov
- 2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
  Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac
- 2022 Poster: The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization »
  Dmitry Kovalev · Alexander Gasnikov
- 2022 Poster: Optimal Algorithms for Decentralized Stochastic Variational Inequalities »
  Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov
- 2022 Poster: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling »
  Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik
- 2022 Poster: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees »
  Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov
- 2021 : Poster Session 2 (gather.town) »
  Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov
- 2021 : Contributed Talks in Session 1 (Zoom) »
  Sebastian Stich · Futong Liu · Abdurakhmon Sadiev · Frederik Benzing · Simon Roburin
- 2021 : Opening Remarks to Session 1 »
  Sebastian Stich
- 2021 Workshop: OPT 2021: Optimization for Machine Learning »
  Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac
- 2021 Poster: Distributed Saddle-Point Problems Under Data Similarity »
  Aleksandr Beznosikov · Gesualdo Scutari · Alexander Rogozin · Alexander Gasnikov
- 2021 Poster: Breaking the centralized barrier for cross-device federated learning »
  Sai Praneeth Karimireddy · Martin Jaggi · Satyen Kale · Mehryar Mohri · Sashank Reddi · Sebastian Stich · Ananda Theertha Suresh
- 2021 Poster: Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks »
  Dmitry Kovalev · Elnur Gasanov · Alexander Gasnikov · Peter Richtarik
- 2021 Poster: RelaySum for Decentralized Deep Learning on Heterogeneous Data »
  Thijs Vogels · Lie He · Anastasiia Koloskova · Sai Praneeth Karimireddy · Tao Lin · Sebastian Stich · Martin Jaggi
- 2021 Poster: An Improved Analysis of Gradient Tracking for Decentralized Machine Learning »
  Anastasiia Koloskova · Tao Lin · Sebastian Stich
- 2020 : Closing remarks »
  Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac
- 2020 : Contributed talks in Session 1 (Zoom) »
  Sebastian Stich · Laurent Condat · Zhize Li · Ohad Shamir · Tiffany Vlaar · Mohammadi Zaki
- 2020 : Live Q&A with Volkan Cevher (Zoom) »
  Sebastian Stich
- 2020 : Live Q&A with Tong Zhang (Zoom) »
  Sebastian Stich
- 2020 : Welcome remarks to Session 1 »
  Sebastian Stich
- 2020 Workshop: OPT2020: Optimization for Machine Learning »
  Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac
- 2020 : Welcome event (gather.town) »
  Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac
- 2020 Poster: Ensemble Distillation for Robust Model Fusion in Federated Learning »
  Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi
- 2020 Poster: Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping »
  Eduard Gorbunov · Marina Danilova · Alexander Gasnikov
- 2018 : Poster spotlight »
  Tianbao Yang · Pavel Dvurechenskii · Panayotis Mertikopoulos · Hugo Berard
- 2018 Poster: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
  Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich
- 2018 Spotlight: Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters »
  Pavel Dvurechenskii · Darina Dvinskikh · Alexander Gasnikov · Cesar Uribe · Angelia Nedich
- 2018 Poster: Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization »
  Robert Gower · Filip Hanzely · Peter Richtarik · Sebastian Stich
- 2018 Poster: Sparsified SGD with Memory »
  Sebastian Stich · Jean-Baptiste Cordonnier · Martin Jaggi
- 2017 Poster: Safe Adaptive Importance Sampling »
  Sebastian Stich · Anant Raj · Martin Jaggi
- 2017 Spotlight: Safe Adaptive Importance Sampling »
  Sebastian Stich · Anant Raj · Martin Jaggi
- 2016 Poster: Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods »
  Lev Bogolubsky · Pavel Dvurechenskii · Alexander Gasnikov · Gleb Gusev · Yurii Nesterov · Andrei M Raigorodskii · Aleksey Tikhonov · Maksim Zhukovskii