Generative Adversarial Networks (GANs) are a powerful class of generative models in the deep learning community. Current practice in large-scale GAN training uses large models and distributed large-batch training strategies, implemented on deep learning frameworks (e.g., TensorFlow, PyTorch) designed in a centralized manner. In a centralized network topology, every worker must either communicate directly with the central node or indirectly with all other workers in every iteration. When network bandwidth is low or network latency is high, performance degrades significantly. Despite recent progress on decentralized algorithms for training deep neural networks, it remains unclear whether GANs can be trained in a decentralized manner. The main difficulty lies in handling the nonconvex-nonconcave min-max optimization and the decentralized communication simultaneously. In this paper, we address this difficulty by designing the first gradient-based decentralized parallel algorithm, which allows workers to have multiple rounds of communication in one iteration and to update the discriminator and generator simultaneously; this design makes the proposed decentralized algorithm amenable to convergence analysis. Theoretically, the proposed algorithm solves a class of nonconvex-nonconcave min-max problems with provable non-asymptotic convergence to a first-order stationary point. Experimental results on GANs demonstrate the effectiveness of the proposed algorithm.
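The two ingredients the abstract highlights — simultaneous (rather than alternating) updates of the two players, and multiple gossip-communication rounds within a single iteration — can be illustrated with a minimal sketch. This is not the authors' algorithm: it replaces the GAN objective with a toy quadratic min-max game on a ring of workers, and all names (`decentralized_minmax`, `gossip_rounds`, `mu`) are hypothetical.

```python
import numpy as np

def decentralized_minmax(n_workers=4, dim=2, iters=500,
                         lr=0.05, mu=0.5, gossip_rounds=3, seed=0):
    """Toy decentralized simultaneous gradient descent-ascent (sketch only).

    Each worker i holds local parameters (x_i, y_i) for the min-max game
        min_x max_y  x^T y + (mu/2)||x||^2 - (mu/2)||y||^2
    (a stand-in for the generator/discriminator objective), then performs
    several gossip-averaging rounds with its ring neighbors per iteration.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_workers, dim))  # local "generator" parameters
    Y = rng.normal(size=(n_workers, dim))  # local "discriminator" parameters

    # Doubly stochastic mixing matrix for a ring topology.
    W = np.zeros((n_workers, n_workers))
    for i in range(n_workers):
        W[i, i] = 0.5
        W[i, (i - 1) % n_workers] = 0.25
        W[i, (i + 1) % n_workers] = 0.25

    for _ in range(iters):
        # Simultaneous update: descent in x, ascent in y, with both
        # gradients evaluated at the same current point (no alternation).
        gx = Y + mu * X        # grad_x of the local objective, row-wise
        gy = X - mu * Y        # grad_y of the local objective, row-wise
        X, Y = X - lr * gx, Y + lr * gy
        # Multiple communication (gossip) rounds within one iteration:
        # each round averages parameters with the ring neighbors.
        for _ in range(gossip_rounds):
            X, Y = W @ X, W @ Y
    return X, Y
```

On this strongly-convex-strongly-concave toy game, all workers' parameters contract toward the saddle point at the origin while the gossip rounds drive them toward consensus; the actual GAN setting in the paper is nonconvex-nonconcave, where only convergence to a first-order stationary point is guaranteed.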
Author Information
Mingrui Liu (Boston University)
Wei Zhang (IBM T. J. Watson Research Center)
BE, Beijing University of Technology, 2005; MSc, Technical University of Denmark, 2008; PhD, University of Wisconsin–Madison, 2013, all in computer science. Published papers in ASPLOS, OOPSLA, OSDI, PLDI, IJCAI, ICDM, NIPS.
Youssef Mroueh (IBM T. J. Watson Research Center)
Xiaodong Cui (IBM T. J. Watson Research Center)
Jarret Ross (IBM)
Tianbao Yang (The University of Iowa)
Payel Das (IBM Research)
More from the Same Authors
- 2020 Poster: Unbalanced Sobolev Descent » (Youssef Mroueh · Mattia Rigotti)
- 2020 Poster: Improved Schemes for Episodic Memory-based Lifelong Learning » (Yunhui Guo · Mingrui Liu · Tianbao Yang · Tajana Rosing)
- 2020 Spotlight: Improved Schemes for Episodic Memory-based Lifelong Learning » (Yunhui Guo · Mingrui Liu · Tianbao Yang · Tajana Rosing)
- 2020 Poster: Ultra-Low Precision 4-bit Training of Deep Neural Networks » (Xiao Sun · Naigang Wang · Chia-Yu Chen · Jiamin Ni · Ankur Agrawal · Xiaodong Cui · Swagath Venkataramani · Kaoutar El Maghraoui · Vijayalakshmi (Viji) Srinivasan · Kailash Gopalakrishnan)
- 2020 Poster: ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training » (Chia-Yu Chen · Jiamin Ni · Songtao Lu · Xiaodong Cui · Pin-Yu Chen · Xiao Sun · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Wei Zhang · Kailash Gopalakrishnan)
- 2020 Oral: Ultra-Low Precision 4-bit Training of Deep Neural Networks » (Xiao Sun · Naigang Wang · Chia-Yu Chen · Jiamin Ni · Ankur Agrawal · Xiaodong Cui · Swagath Venkataramani · Kaoutar El Maghraoui · Vijayalakshmi (Viji) Srinivasan · Kailash Gopalakrishnan)
- 2020 Poster: Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization » (Yan Yan · Yi Xu · Qihang Lin · Wei Liu · Tianbao Yang)
- 2020 Poster: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models » (Vijil Chenthamarakshan · Payel Das · Samuel Hoffman · Hendrik Strobelt · Inkit Padhi · Kar Wai Lim · Benjamin Hoover · Matteo Manica · Jannis Born · Teodoro Laino · Aleksandra Mojsilovic)
- 2020 Poster: Optimizing Mode Connectivity via Neuron Alignment » (Norman J Tatro · Pin-Yu Chen · Payel Das · Igor Melnyk · Prasanna Sattigeri · Rongjie Lai)
- 2020 Expo Talk Panel: AI against COVID-19 at IBM Research » (Divya Pathak · Payel Das · Michal Rosen-Zvi · Salim Roukos)
- 2019 Poster: Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks » (Xiao Sun · Jungwook Choi · Chia-Yu Chen · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Xiaodong Cui · Wei Zhang · Kailash Gopalakrishnan)
- 2019 Poster: Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems » (Yi Xu · Rong Jin · Tianbao Yang)
- 2019 Poster: Stagewise Training Accelerates Convergence of Testing Error Over SGD » (Zhuoning Yuan · Yan Yan · Rong Jin · Tianbao Yang)
- 2019 Poster: Sobolev Independence Criterion » (Youssef Mroueh · Tom Sercu · Mattia Rigotti · Inkit Padhi · Cicero Nogueira dos Santos)
- 2018 Poster: First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time » (Yi Xu · Rong Jin · Tianbao Yang)
- 2018 Poster: Adaptive Negative Curvature Descent with Applications in Non-convex Optimization » (Mingrui Liu · Zhe Li · Xiaoyu Wang · Jinfeng Yi · Tianbao Yang)
- 2018 Poster: Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization » (Xiaoxuan Zhang · Mingrui Liu · Xun Zhou · Tianbao Yang)
- 2018 Poster: Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions » (Mingrui Liu · Xiaoxuan Zhang · Lijun Zhang · Rong Jin · Tianbao Yang)
- 2018 Poster: Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives » (Amit Dhurandhar · Pin-Yu Chen · Ronny Luss · Chun-Chen Tu · Paishun Ting · Karthikeyan Shanmugam · Payel Das)
- 2018 Poster: Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks » (Xiaodong Cui · Wei Zhang · Zoltán Tüske · Michael Picheny)
- 2017 Poster: ADMM without a Fixed Penalty Parameter: Faster Convergence with New Adaptive Penalization » (Yi Xu · Mingrui Liu · Qihang Lin · Tianbao Yang)
- 2017 Poster: Fisher GAN » (Youssef Mroueh · Tom Sercu)
- 2017 Poster: Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent » (Xiangru Lian · Ce Zhang · Huan Zhang · Cho-Jui Hsieh · Wei Zhang · Ji Liu)
- 2017 Oral: Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent » (Xiangru Lian · Ce Zhang · Huan Zhang · Cho-Jui Hsieh · Wei Zhang · Ji Liu)
- 2017 Poster: Improved Dynamic Regret for Non-degenerate Functions » (Lijun Zhang · Tianbao Yang · Jinfeng Yi · Rong Jin · Zhi-Hua Zhou)
- 2017 Poster: Adaptive Accelerated Gradient Converging Method under Hölderian Error Bound Condition » (Mingrui Liu · Tianbao Yang)
- 2017 Poster: Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter » (Yi Xu · Qihang Lin · Tianbao Yang)
- 2017 Poster: Dilated Recurrent Neural Networks » (Shiyu Chang · Yang Zhang · Wei Han · Mo Yu · Xiaoxiao Guo · Wei Tan · Xiaodong Cui · Michael Witbrock · Mark Hasegawa-Johnson · Thomas Huang)
- 2016 Poster: Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/ε) » (Yi Xu · Yan Yan · Qihang Lin · Tianbao Yang)
- 2016 Poster: Improved Dropout for Shallow and Deep Learning » (Zhe Li · Boqing Gong · Tianbao Yang)