Poster
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Yu Bai · Chi Jin · Song Mei · Ziang Song · Tiancheng Yu
A conceptually appealing approach to learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs). This conversion lets us directly translate state-of-the-art techniques and analyses for NFGs to EFGs, but it is typically computationally intractable because the converted game is exponentially larger. In this paper, we address this problem in natural and important setups for the $\Phi$-Hedge algorithm, a generic algorithm capable of learning a large class of equilibria for NFGs. We show that $\Phi$-Hedge can be directly used to learn Nash Equilibria (in zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs. We prove that, in those settings, the $\Phi$-Hedge algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for EFGs with suitable dilated regularizers, and run in polynomial time. This new connection further allows us to design and analyze a new class of OMD algorithms based on modifying the log-partition function. In particular, we design an improved algorithm with balancing techniques that achieves a sharp $\widetilde{\mathcal{O}}(\sqrt{XAT})$ EFCE-regret under bandit feedback in an EFG with $X$ information sets, $A$ actions, and $T$ episodes. To the best of our knowledge, this is the first such rate, and it matches the information-theoretic lower bound.
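As background for the abstract's terminology, the following is a minimal sketch of plain Hedge (exponential weights) in self-play on a small zero-sum normal-form game. It is not the paper's $\Phi$-Hedge or its EFG algorithms; it only illustrates the simplest member of the family, which coincides with Online Mirror Descent on the simplex with an entropy regularizer. The function names `hedge_weights`, `hedge_self_play`, and `exploitability`, the step-size choice, and the 2x2 payoff matrix are all assumptions made for illustration.

```python
import math


def hedge_weights(cum_loss, eta):
    """Exponential-weights distribution: p_i proportional to exp(-eta * cum_loss_i)."""
    base = min(cum_loss)  # subtract the minimum for numerical stability
    w = [math.exp(-eta * (c - base)) for c in cum_loss]
    total = sum(w)
    return [v / total for v in w]


def hedge_self_play(payoff, T=5000):
    """Run Hedge for both players of a zero-sum matrix game.

    payoff[i][j] is the loss of the row player (and the gain of the
    column player). Returns the time-averaged strategies of both players.
    """
    n, m = len(payoff), len(payoff[0])
    eta = math.sqrt(math.log(max(n, m)) / T)  # illustrative fixed step size
    loss_x, loss_y = [0.0] * n, [0.0] * m
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(T):
        x = hedge_weights(loss_x, eta)
        y = hedge_weights(loss_y, eta)
        for i in range(n):
            # full-information feedback: expected loss of each row action
            loss_x[i] += sum(payoff[i][j] * y[j] for j in range(m))
            avg_x[i] += x[i] / T
        for j in range(m):
            # the column player maximizes, so its loss is the negated payoff
            loss_y[j] -= sum(payoff[i][j] * x[i] for i in range(n))
            avg_y[j] += y[j] / T
    return avg_x, avg_y


def exploitability(payoff, x, y):
    """Duality gap: best-response gain of the column player minus that of the row player."""
    n, m = len(payoff), len(payoff[0])
    best_col = max(sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m))
    best_row = min(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_col - best_row


if __name__ == "__main__":
    payoff = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical example game
    x_bar, y_bar = hedge_self_play(payoff)
    print(x_bar, y_bar, exploitability(payoff, x_bar, y_bar))
```

Because each player's average regret vanishes as $T$ grows, the duality gap of the averaged strategies shrinks toward zero, so they approach a Nash equilibrium of the matrix game. Applying this directly to an EFG converted to normal form would require weights over exponentially many pure strategies, which is the intractability the abstract refers to.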
Author Information
Yu Bai (Salesforce Research)
Chi Jin (Princeton University)
Song Mei (University of California, Berkeley)
Ziang Song (Stanford University)
Tiancheng Yu (MIT)
More from the Same Authors
- 2021 Spotlight: Understanding the Under-Coverage Bias in Uncertainty Estimation
  Yu Bai · Song Mei · Huan Wang · Caiming Xiong
- 2022 : Achieving Diversity and Relevancy in Zero-Shot Recommender Systems for Human Evaluations
  Tiancheng Yu · Yifei Ma · Anoop Deoras
- 2023 Poster: Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL
  Qinghua Liu · Gellért Weisz · András György · Chi Jin · Csaba Szepesvari
- 2023 Poster: What can a Single Attention Layer Learn? A Study Through the Random Features Lens
  Hengyu Fu · Tianyu Guo · Yu Bai · Song Mei
- 2023 Poster: Is RLHF More Difficult than Standard RL?
  Yuanhao Wang · Qinghua Liu · Chi Jin
- 2023 Poster: Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
  Yu Bai · Fan Chen · Huan Wang · Caiming Xiong · Song Mei
- 2023 Poster: Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations
  Minshuo Chen · Yu Bai · H. Vincent Poor · Mengdi Wang
- 2023 Poster: Context-lumpable stochastic bandits
  Chung-Wei Lee · Qinghua Liu · Yasin Abbasi Yadkori · Chi Jin · Tor Lattimore · Csaba Szepesvari
- 2023 Poster: DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
  Ahmed Khaled Ragab Bayoumi · Konstantin Mishchenko · Chi Jin
- 2023 Oral: Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
  Yu Bai · Fan Chen · Huan Wang · Caiming Xiong · Song Mei
- 2022 Poster: Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
  Eshaan Nichani · Yu Bai · Jason Lee
- 2022 Poster: Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
  Qinghua Liu · Csaba Szepesvari · Chi Jin
- 2022 Poster: Policy Optimization for Markov Games: Unified Framework and Faster Convergence
  Runyu Zhang · Qinghua Liu · Huan Wang · Caiming Xiong · Na Li · Yu Bai
- 2022 Poster: Learning with convolution and pooling operations in kernel methods
  Theodor Misiakiewicz · Song Mei
- 2022 Poster: Sample-Efficient Learning of Correlated Equilibria in Extensive-Form Games
  Ziang Song · Song Mei · Yu Bai
- 2021 Poster: Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
  Yu Bai · Chi Jin · Huan Wang · Caiming Xiong
- 2021 Poster: Understanding the Under-Coverage Bias in Uncertainty Estimation
  Yu Bai · Song Mei · Huan Wang · Caiming Xiong
- 2021 Poster: Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
  Tengyang Xie · Nan Jiang · Huan Wang · Caiming Xiong · Yu Bai
- 2021 Poster: Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
  Ming Yin · Yu Bai · Yu-Xiang Wang
- 2020 Poster: On the Theory of Transfer Learning: The Importance of Task Diversity
  Nilesh Tripuraneni · Michael Jordan · Chi Jin
- 2020 Poster: Near-Optimal Reinforcement Learning with Self-Play
  Yu Bai · Chi Jin · Tiancheng Yu
- 2020 Poster: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
  Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu
- 2020 Spotlight: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
  Chi Jin · Sham Kakade · Akshay Krishnamurthy · Qinghua Liu
- 2020 Poster: On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
  Zhuoran Yang · Chi Jin · Zhaoran Wang · Mengdi Wang · Michael Jordan
- 2019 Poster: Provably Efficient Q-Learning with Low Switching Cost
  Yu Bai · Tengyang Xie · Nan Jiang · Yu-Xiang Wang