Timezone: »
Spotlight Poster
Regret Matching+: (In)Stability and Fast Convergence in Games
Gabriele Farina · Julien Grand-Clément · Christian Kroer · Chung-Wei Lee · Haipeng Luo
Regret Matching$^+$ (RM$^+$) and its variants are important algorithms for solving large-scale games.However, a theoretical understanding of their success in practice is still a mystery.Moreover, recent advances on fast convergence in games are limited to no-regret algorithms such as online mirror descent, which satisfy stability.In this paper, we first give counterexamples showing that RM+ and its predictive version can be unstable, which might cause other players to suffer large regret. We then provide two fixes: restarting and chopping off the positive orthant that RM$^+$ works in.We show that these fixes are sufficient to get $O(T^{1/4})$ individual regret and $O(1)$ social regret in normal-form games via RM$^+$ with predictions.We also apply our stabilizing techniques to clairvoyant updates in the uncoupled learning setting for RM$^+$ and prove desirable results akin to recent works for Clairvoyant online mirror descent. Our experiments show the advantages of our algorithms over vanilla RM$^+$-based algorithms in matrix and extensive-form games.
Author Information
Gabriele Farina (MIT)
Julien Grand-Clément (HEC Paris / Hi!Paris)
I am an Assistant Professor in the Information Systems and Operations Management department at HEC Paris and a Hi! Paris chair holder. My research focuses on data-driven decision making under parameter uncertainty, with applications in healthcare and algorithmic game theory. I received my Ph.D. in Operations Research from Columbia University in 2021 where my advisors were Prof. Vineet Goyal and Prof. Carri Chan. Before Columbia, I completed my master and undergraduate studies at Ecole polytechnique (France) in 2016, with a major in algorithms and optimization.
Christian Kroer (Columbia University)
Chung-Wei Lee (University of Southern California)
Haipeng Luo (University of Southern California)
More from the Same Authors
-
2022 : Clairvoyant Regret Minimization: Equivalence with Nemirovski’s Conceptual Prox Method and Extension to General Convex Games »
Gabriele Farina · Christian Kroer · Chung-Wei Lee · Haipeng Luo -
2022 : A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games »
Samuel Sokota · Ryan D'Orazio · J. Zico Kolter · Nicolas Loizou · Marc Lanctot · Ioannis Mitliagkas · Noam Brown · Christian Kroer -
2022 : Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning »
Anton Bakhtin · David Wu · Adam Lerer · Jonathan Gray · Athul Jacob · Gabriele Farina · Alexander Miller · Noam Brown -
2023 : Average-Constrained Policy Optimization »
Akhil Agnihotri · Rahul Jain · Haipeng Luo -
2023 : Efficient Learning in Polyhedral Games via Best Response Oracles »
Darshan Chakrabarti · Gabriele Farina · Christian Kroer -
2023 : The Consensus Game: Language Model Generation via Equilibrium Search »
Athul Jacob · Yikang Shen · Gabriele Farina · Jacob Andreas -
2023 : The Consensus Game: Language Model Generation via Equilibrium Search »
Athul Jacob · Yikang Shen · Gabriele Farina · Jacob Andreas -
2023 Poster: Computing Optimal Equilibria and Mechanisms via Learning in Zero-Sum Extensive-Form Games »
Brian Zhang · Gabriele Farina · Ioannis Anagnostides · Federico Cacciamani · Stephen McAleer · Andreas Haupt · Andrea Celli · Nicola Gatti · Vincent Conitzer · Tuomas Sandholm -
2023 Poster: Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor »
Julien Grand-Clément · Marek Petrik -
2023 Poster: Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning »
Stephen McAleer · Gabriele Farina · Gaoyue Zhou · Mingzhi Wang · Yaodong Yang · Tuomas Sandholm -
2023 Poster: Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games »
Gabriele Farina · Charilaos Pipis -
2023 Poster: Context-lumpable stochastic bandits »
Chung-Wei Lee · Qinghua Liu · Yasin Abbasi Yadkori · Chi Jin · Tor Lattimore · Csaba Szepesvari -
2023 Poster: Block-Coordinate Methods and Restarting for Solving Extensive-Form Games »
Darshan Chakrabarti · Jelena Diakonikolas · Christian Kroer -
2023 Poster: Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback »
Yang Cai · Haipeng Luo · Chen-Yu Wei · Weiqiang Zheng -
2023 Poster: On the Convergence of No-Regret Learning Dynamics in Time-Varying Games »
Ioannis Anagnostides · Ioannis Panageas · Gabriele Farina · Tuomas Sandholm -
2023 Poster: No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions »
Tiancheng Jin · Junyan Liu · Chloé Rouyer · William Chang · Chen-Yu Wei · Haipeng Luo -
2023 Poster: Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms »
Tiancheng Jin · Junyan Liu · Haipeng Luo -
2023 Poster: Practical Contextual Bandits with Feedback Graphs »
Mengxiao Zhang · Yuheng Zhang · Olga Vrousgou · Haipeng Luo · Paul Mineiro -
2022 : ESCHER: ESCHEWING IMPORTANCE SAMPLING IN GAMES BY COMPUTING A HISTORY VALUE FUNCTION TO ESTIMATE REGRET »
Stephen McAleer · Gabriele Farina · Marc Lanctot · Tuomas Sandholm -
2022 Spotlight: Lightning Talks 4A-2 »
Barakeel Fanseu Kamhoua · Hualin Zhang · Taiki Miyagawa · Tomoya Murata · Xin Lyu · Yan Dai · Elena Grigorescu · Zhipeng Tu · Lijun Zhang · Taiji Suzuki · Wei Jiang · Haipeng Luo · Lin Zhang · Xi Wang · Young-San Lin · Huan Xiong · Liyu Chen · Bin Gu · Jinfeng Yi · Yongqiang Chen · Sandeep Silwal · Yiguang Hong · Maoyuan 'Raymond' Song · Lei Wang · Tianbao Yang · Han Yang · MA Kaili · Samson Zhou · Deming Yuan · Bo Han · Guodong Shi · Bo Li · James Cheng -
2022 Spotlight: Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback »
Yan Dai · Haipeng Luo · Liyu Chen -
2022 : Poster Session 2 »
Jinwuk Seok · Bo Liu · Ryotaro Mitsuboshi · David Martinez-Rubio · Weiqiang Zheng · Ilgee Hong · Chen Fan · Kazusato Oko · Bo Tang · Miao Cheng · Aaron Defazio · Tim G. J. Rudner · Gabriele Farina · Vishwak Srinivasan · Ruichen Jiang · Peng Wang · Jane Lee · Nathan Wycoff · Nikhil Ghosh · Yinbin Han · David Mueller · Liu Yang · Amrutha Varshini Ramesh · Siqi Zhang · Kaifeng Lyu · David Yunis · Kumar Kshitij Patel · Fangshuo Liao · Dmitrii Avdiukhin · Xiang Li · Sattar Vakili · Jiaxin Shi -
2022 Poster: Nonstationary Dual Averaging and Online Fair Allocation »
Luofeng Liao · Yuan Gao · Christian Kroer -
2022 Poster: Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments »
Liyu Chen · Haipeng Luo -
2022 Poster: Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games »
Ioannis Anagnostides · Gabriele Farina · Christian Kroer · Chung-Wei Lee · Haipeng Luo · Tuomas Sandholm -
2022 Poster: Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback »
Tiancheng Jin · Tal Lancewicki · Haipeng Luo · Yishay Mansour · Aviv Rosenberg -
2022 Poster: Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback »
Yan Dai · Haipeng Luo · Liyu Chen -
2022 Poster: Optimal Efficiency-Envy Trade-Off via Optimal Transport »
Steven Yin · Christian Kroer -
2022 Poster: Optimistic Mirror Descent Either Converges to Nash or to Strong Coarse Correlated Equilibria in Bimatrix Games »
Ioannis Anagnostides · Gabriele Farina · Ioannis Panageas · Tuomas Sandholm -
2022 Poster: Subgame Solving in Adversarial Team Games »
Brian Zhang · Luca Carminati · Federico Cacciamani · Gabriele Farina · Pierriccardo Olivieri · Nicola Gatti · Tuomas Sandholm -
2022 Poster: Near-Optimal No-Regret Learning Dynamics for General Convex Games »
Gabriele Farina · Ioannis Anagnostides · Haipeng Luo · Chung-Wei Lee · Christian Kroer · Tuomas Sandholm -
2022 Expo Demonstration: Human Modeling and Strategic Reasoning in the Game of Diplomacy »
Noam Brown · Alexander Miller · Gabriele Farina -
2021 Poster: The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition »
Tiancheng Jin · Longbo Huang · Haipeng Luo -
2021 Poster: Last-iterate Convergence in Extensive-Form Games »
Chung-Wei Lee · Christian Kroer · Haipeng Luo -
2021 Poster: Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium »
Gabriele Farina · Tuomas Sandholm -
2021 Poster: Online Market Equilibrium with Application to Fair Division »
Yuan Gao · Alex Peysakhovich · Christian Kroer -
2021 Poster: Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path »
Liyu Chen · Mehdi Jafarnia-Jahromi · Rahul Jain · Haipeng Luo -
2021 Poster: Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving »
Julien Grand-Clément · Christian Kroer -
2021 Poster: Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses »
Haipeng Luo · Chen-Yu Wei · Chung-Wei Lee -
2021 Oral: The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition »
Tiancheng Jin · Longbo Huang · Haipeng Luo -
2020 Poster: Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs »
Chung-Wei Lee · Haipeng Luo · Chen-Yu Wei · Mengxiao Zhang -
2020 Poster: Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition »
Tiancheng Jin · Haipeng Luo -
2020 Poster: Evaluating and Rewarding Teamwork Using Cooperative Game Abstractions »
Tom Yan · Christian Kroer · Alexander Peysakhovich -
2020 Spotlight: Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition »
Tiancheng Jin · Haipeng Luo -
2020 Oral: Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs »
Chung-Wei Lee · Haipeng Luo · Chen-Yu Wei · Mengxiao Zhang -
2020 Poster: First-Order Methods for Large-Scale Market Equilibrium Computation »
Yuan Gao · Christian Kroer -
2020 Poster: Polynomial-Time Computation of Optimal Correlated Equilibria in Two-Player Extensive-Form Games with Public Chance Moves and Beyond »
Gabriele Farina · Tuomas Sandholm -
2020 Poster: No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium »
Andrea Celli · Alberto Marchesi · Gabriele Farina · Nicola Gatti -
2020 Poster: Comparator-Adaptive Convex Bandits »
Dirk van der Hoeven · Ashok Cutkosky · Haipeng Luo -
2020 Oral: No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium »
Andrea Celli · Alberto Marchesi · Gabriele Farina · Nicola Gatti -
2019 Poster: Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks »
Gabriele Farina · Chun Kai Ling · Fei Fang · Tuomas Sandholm -
2019 Poster: Equipping Experts/Bandits with Long-term Memory »
Kai Zheng · Haipeng Luo · Ilias Diakonikolas · Liwei Wang -
2019 Poster: Efficient Regret Minimization Algorithm for Extensive-Form Correlated Equilibrium »
Gabriele Farina · Chun Kai Ling · Fei Fang · Tuomas Sandholm -
2019 Poster: Model Selection for Contextual Bandits »
Dylan Foster · Akshay Krishnamurthy · Haipeng Luo -
2019 Spotlight: Model Selection for Contextual Bandits »
Dylan Foster · Akshay Krishnamurthy · Haipeng Luo -
2019 Spotlight: Efficient Regret Minimization Algorithm for Extensive-Form Correlated Equilibrium »
Gabriele Farina · Chun Kai Ling · Fei Fang · Tuomas Sandholm -
2019 Poster: Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions »
Gabriele Farina · Christian Kroer · Tuomas Sandholm -
2019 Poster: Robust Multi-agent Counterfactual Prediction »
Alexander Peysakhovich · Christian Kroer · Adam Lerer -
2019 Poster: Hypothesis Set Stability and Generalization »
Dylan Foster · Spencer Greenberg · Satyen Kale · Haipeng Luo · Mehryar Mohri · Karthik Sridharan -
2018 : Regret Decomposition in Sequential Games with Convex Action Spaces and Losses »
Gabriele Farina -
2018 Poster: Solving Large Sequential Games with the Excessive Gap Technique »
Christian Kroer · Gabriele Farina · Tuomas Sandholm -
2018 Poster: Practical exact algorithm for trembling-hand equilibrium refinements in games »
Gabriele Farina · Nicola Gatti · Tuomas Sandholm -
2018 Spotlight: Solving Large Sequential Games with the Excessive Gap Technique »
Christian Kroer · Gabriele Farina · Tuomas Sandholm -
2018 Poster: Ex ante coordination and collusion in zero-sum multi-player extensive-form games »
Gabriele Farina · Andrea Celli · Nicola Gatti · Tuomas Sandholm