Timezone: »
Poster
Finding Correlated Equilibrium of Constrained Markov Game: A PrimalDual Approach
Ziyi Chen · Shaocong Ma · Yi Zhou
Constrained Markov game is a fundamental problem that covers many applications, where multiple players compete with each other under behavioral constraints. The existing literature has proved the existence of Nash equilibrium for constrained Markov games, which turns out to be PPADcomplete and cannot be computed in polynomial time. In this work, we propose a surrogate notion of correlated equilibrium (CE) for constrained Markov games that can be computed in polynomial time, and study its fundamental properties. We show that the modification structure of CE of constrained Markov games is fundamentally different from that of unconstrained Markov games. Moreover, we prove that the corresponding Lagrangian function has zero duality gap. Based on this result, we develop the first primaldual algorithm that provably converges to CE of constrained Markov games. In particular, we prove that both the duality gap and the constraint violation of the output policy converge at the rate $\mathcal{O}(\frac{1}{\sqrt{T}})$. Moreover, when adopting the Vlearning algorithm as the subroutine in the primal update, our algorithm achieves an approximate CE with $\epsilon$ duality gap with the sample complexity $\mathcal{O}(H^9\mathcal{S}\mathcal{A}^{2} \epsilon^{4})$.
Author Information
Ziyi Chen (University of Utah)
Shaocong Ma (University of Utah)
Yi Zhou (University of Utah)
More from the Same Authors

2021 Poster: NonAsymptotic Analysis for Two Timescale TDC with General Smooth Function Approximation »
Yue Wang · Shaofeng Zou · Yi Zhou 
2020 Poster: A Statistical Mechanics Framework for TaskAgnostic Sample Design in Machine Learning »
Bhavya Kailkhura · Jayaraman Thiagarajan · Qunwei Li · Jize Zhang · Yi Zhou · Timo Bremer 
2020 Poster: VarianceReduced OffPolicy TDC Learning: NonAsymptotic Convergence Analysis »
Shaocong Ma · Yi Zhou · Shaofeng Zou 
2019 Poster: SpiderBoost and Momentum: Faster Variance Reduction Algorithms »
Zhe Wang · Kaiyi Ji · Yi Zhou · Yingbin Liang · Vahid Tarokh