Timezone: »
SGD with momentum (SGDM) has been widely applied in many machine learning tasks, and it is often applied with dynamic stepsizes and momentum weights tuned in a stagewise manner. Despite of its empirical advantage over SGD, the role of momentum is still unclear in general since previous analyses on SGDM either provide worse convergence bounds than those of SGD, or assume Lipschitz or quadratic objectives, which fail to hold in practice. Furthermore, the role of dynamic parameters has not been addressed. In this work, we show that SGDM converges as fast as SGD for smooth objectives under both strongly convex and nonconvex settings. We also prove that multistage strategy is beneficial for SGDM compared to using fixed parameters. Finally, we verify these theoretical claims by numerical experiments.
Author Information
Yanli Liu (UCLA)
Yuan Gao (Columbia University)
I am a PhD student in Operations Research (IEOR) at Columbia University working on large-scale optimization in machine learning and decision-under-uncertainty. I have also passed the CFA Level I Exam.
Wotao Yin (Alibaba US, DAMO Academy)
More from the Same Authors
-
2021 Spotlight: Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems »
Tianyi Chen · Yuejiao Sun · Wotao Yin -
2021 : Practice-Consistent Analysis of Adam-Style Methods »
Zhishuai Guo · Yi Xu · Wotao Yin · Rong Jin · Tianbao Yang -
2022 Poster: Nonstationary Dual Averaging and Online Fair Allocation »
Luofeng Liao · Yuan Gao · Christian Kroer -
2021 Poster: Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems »
Tianyi Chen · Yuejiao Sun · Wotao Yin -
2021 Poster: Online Market Equilibrium with Application to Fair Division »
Yuan Gao · Alex Peysakhovich · Christian Kroer -
2021 Poster: Hyperparameter Tuning is All You Need for LISTA »
Xiaohan Chen · Jialin Liu · Zhangyang Wang · Wotao Yin -
2021 Poster: Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection »
HanQin Cai · Jialin Liu · Wotao Yin -
2021 Poster: Exponential Graph is Provably Efficient for Decentralized Deep Training »
Bicheng Ying · Kun Yuan · Yiming Chen · Hanbin Hu · PAN PAN · Wotao Yin -
2021 Poster: An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders »
Xinmeng Huang · Kun Yuan · Xianghui Mao · Wotao Yin -
2020 Poster: First-Order Methods for Large-Scale Market Equilibrium Computation »
Yuan Gao · Christian Kroer -
2020 Poster: An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods »
Yanli Liu · Kaiqing Zhang · Tamer Basar · Wotao Yin -
2020 Poster: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning »
Fei Feng · Ruosong Wang · Wotao Yin · Simon Du · Lin Yang -
2020 Spotlight: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning »
Fei Feng · Ruosong Wang · Wotao Yin · Simon Du · Lin Yang -
2018 Poster: Breaking the Span Assumption Yields Fast Finite-Sum Minimization »
Robert Hannah · Yanli Liu · Daniel O'Connor · Wotao Yin