This paper proposes a new easy-to-implement parameter-free gradient-based optimizer: DoWG (Distance over Weighted Gradients). We prove that DoWG is efficient---matching the convergence rate of optimally tuned gradient descent in convex optimization up to a logarithmic factor without tuning any parameters, and universal---automatically adapting to both smooth and nonsmooth problems. While popular algorithms following the AdaGrad framework compute a running average of the squared gradients, DoWG maintains a new distance-based weighted version of the running average, which is crucial to achieve the desired properties. To complement our theory, we also show empirically that DoWG trains at the edge of stability, and validate its effectiveness on practical machine learning tasks.
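The abstract's description (a distance-based weighted running average of squared gradients, with no tuned step size) can be sketched as follows. This is a minimal illustration, not the paper's reference implementation: the exact accumulator and step-size formulas below are an assumption consistent with the abstract, and `dowg`, `r_eps`, and `steps` are names chosen here for illustration.

```python
import numpy as np

def dowg(grad, x0, steps=200, r_eps=1e-2):
    """DoWG-style parameter-free gradient descent (sketch).

    r_bar tracks the largest distance traveled from the initial point,
    and v is a distance-weighted running sum of squared gradient norms,
    as opposed to AdaGrad's unweighted running sum. The precise update
    form is assumed here for illustration.
    """
    x = np.asarray(x0, dtype=float)
    x_init = x.copy()
    r_bar, v = r_eps, 0.0  # r_eps: small initial distance estimate
    for _ in range(steps):
        g = grad(x)
        r_bar = max(r_bar, float(np.linalg.norm(x - x_init)))
        v += r_bar ** 2 * float(np.dot(g, g))  # distance-weighted average
        x = x - (r_bar ** 2 / np.sqrt(v)) * g  # step size is never tuned
    return x

# Example: minimize f(x) = ||x||^2 (gradient 2x) with no learning rate.
x_star = dowg(lambda x: 2.0 * x, x0=[5.0, -3.0])
```

Note that the only free quantity, `r_eps`, is an initial distance estimate rather than a step size; the method's behavior is insensitive to it, which is what "parameter-free" refers to.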
Author Information
Ahmed Khaled (Princeton University)
Konstantin Mishchenko (Samsung)
Chi Jin (Princeton University)
More from the Same Authors
- 2021: On Server-Side Stepsizes in Federated Optimization: Theory Explaining the Heuristics
  Grigory Malinovsky · Konstantin Mishchenko · Peter Richtarik
- 2021: FedMix: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning
  Elnur Gasanov · Ahmed Khaled · Samuel Horváth · Peter Richtarik
- 2022: Parameter Free Dual Averaging: Optimizing Lipschitz Functions in a Single Pass
  Aaron Defazio · Konstantin Mishchenko
- 2023: Noise Injection Irons Out Local Minima and Saddle Points
  Konstantin Mishchenko · Sebastian Stich
- 2023: A novel analysis of gradient descent under directional smoothness
  Aaron Mishkin · Ahmed Khaled · Aaron Defazio · Robert Gower
- 2023: Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
  Jiawei Ge · Shange Tang · Jianqing Fan · Cong Ma · Chi Jin
- 2023: Poster Session 2
  Xiao-Yang Liu · Guy Kornowski · Philipp Dahlinger · Abbas Ehsanfar · Binyamin Perets · David Martinez-Rubio · Sudeep Raja Putta · Runlong Zhou · Connor Lawless · Julian J Stier · Chen Fan · Michal Šustr · James Spann · Jung Hun Oh · Yao Xie · Qi Zhang · Krishna Acharya · Sourabh Medapati · Sharan Vaswani · Sruthi Gorantla · Darshan Chakrabarti · Mohamed Elsayed · Hongyang Zhang · Reza Asad · Viktor Pavlovic · Betty Shea · Georgy Noarov · Chuan He · Daniil Vankov · Taoan Huang · Michael Lu · Anant Mathur · Konstantin Mishchenko · Stanley Wei · Francesco Faccio · Yuchen Zeng · Tianyue Zhang · Chris Junchi Li · Aaron Mishkin · Sina Baharlouei · Chen Xu · Sasha Abramowitz · Sebastian Stich
- 2023 Poster: Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL
  Qinghua Liu · Gellert Weisz · András György · Chi Jin · Csaba Szepesvari
- 2023 Poster: Is RLHF More Difficult than Standard RL? A Theoretical Perspective
  Yuanhao Wang · Qinghua Liu · Chi Jin
- 2023 Poster: Context-lumpable stochastic bandits
  Chung-Wei Lee · Qinghua Liu · Yasin Abbasi Yadkori · Chi Jin · Tor Lattimore · Csaba Szepesvari
- 2022: Asynchronous Optimization: Delays, Stability, and the Impact of Data Heterogeneity
  Konstantin Mishchenko
- 2022 Poster: Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
  Yu Bai · Chi Jin · Song Mei · Ziang Song · Tiancheng Yu
- 2022 Poster: Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
  Qinghua Liu · Csaba Szepesvari · Chi Jin
- 2022 Poster: Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays
  Konstantin Mishchenko · Francis Bach · Mathieu Even · Blake Woodworth
- 2020 Poster: Random Reshuffling: Simple Analysis with Vast Improvements
  Konstantin Mishchenko · Ahmed Khaled · Peter Richtarik
- 2020 Poster: On the Theory of Transfer Learning: The Importance of Task Diversity
  Nilesh Tripuraneni · Michael Jordan · Chi Jin
- 2020 Poster: Near-Optimal Reinforcement Learning with Self-Play
  Yu Bai · Chi Jin · Tiancheng Yu
- 2020 Poster: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
  Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu
- 2020 Spotlight: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
  Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu
- 2020 Poster: On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
  Zhuoran Yang · Chi Jin · Zhaoran Wang · Mengdi Wang · Michael Jordan
- 2019: Lunch break and poster
  Felix Sattler · Khaoula El Mekkaoui · Neta Shoham · Cheng Hong · Florian Hartmann · Boyue Li · Daliang Li · Sebastian Caldas Rivera · Jianyu Wang · Kartikeya Bhardwaj · Tribhuvanesh Orekondy · YAN KANG · Dashan Gao · Mingshu Cong · Xin Yao · Songtao Lu · JIAHUAN LUO · Shicong Cen · Peter Kairouz · Yihan Jiang · Tzu Ming Hsu · Aleksei Triastcyn · Yang Liu · Ahmed Khaled · Zhicong Liang · Boi Faltings · Seungwhan Moon · Suyi Li · Tao Fan · Tianchi Huang · Chunyan Miao · Hang Qi · Matthew Brown · Lucas Glass · Junpu Wang · Wei Chen · Radu Marculescu · tomer avidor · Xueyang Wu · Mingyi Hong · Ce Ju · John Rush · Ruixiao Zhang · Youchi ZHOU · Françoise Beaufays · Yingxuan Zhu · Lei Xia
- 2019: Spotlight talks
  Damien Scieur · Konstantin Mishchenko · Rohan Anil
- 2019: Poster Session
  Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma
- 2018 Poster: SEGA: Variance Reduction via Gradient Sketching
  Filip Hanzely · Konstantin Mishchenko · Peter Richtarik