In this work, we study two-player zero-sum stochastic games and develop a variant of the smoothed best-response learning dynamics that combines independent learning dynamics for matrix games with minimax value iteration for stochastic games. The resulting learning dynamics are payoff-based, convergent, rational, and symmetric between the two players. Our theoretical results present, to the best of our knowledge, the first last-iterate finite-sample analysis of such independent learning dynamics. To establish these results, we develop a coupled Lyapunov drift approach to capture the evolution of multiple sets of coupled and stochastic iterates, which might be of independent interest.
Author Information
Zaiwei Chen (California Institute of Technology)
Kaiqing Zhang (University of Maryland, College Park)
Eric Mazumdar (California Institute of Technology)
Asuman Ozdaglar (Massachusetts Institute of Technology)
Asu Ozdaglar received the B.S. degree in electrical engineering from the Middle East Technical University, Ankara, Turkey, in 1996, and the S.M. and the Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, in 1998 and 2003, respectively. She is currently a professor in the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. She is also the director of the Laboratory for Information and Decision Systems. Her research expertise includes optimization theory, with emphasis on nonlinear programming and convex analysis, game theory, with applications in communication, social, and economic networks, distributed optimization and control, and network analysis with special emphasis on contagious processes, systemic risk and dynamic control. Professor Ozdaglar is the recipient of a Microsoft fellowship, the MIT Graduate Student Council Teaching award, the NSF Career award, the 2008 Donald P. Eckman award of the American Automatic Control Council, the Class of 1943 Career Development Chair, the inaugural Steven and Renee Innovation Fellowship, and the 2014 Spira teaching award. She served on the Board of Governors of the Control System Society in 2010 and was an associate editor for IEEE Transactions on Automatic Control. She is currently the area co-editor for a new area for the journal Operations Research, entitled "Games, Information and Networks." She is the co-author of the book entitled "Convex Analysis and Optimization" (Athena Scientific, 2003).
Adam Wierman (Caltech)
More from the Same Authors
-
2021 Spotlight: Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems »
Yiheng Lin · Yang Hu · Guanya Shi · Haoyuan Sun · Guannan Qu · Adam Wierman -
2022 : Robustifying machine-learned algorithms for efficient grid operation »
Nicolas Christianson · Christopher Yeh · Tongxin Li · Mahdi Torabi Rad · Azarang Golmohammadi · Adam Wierman -
2022 : Stability Constrained Reinforcement Learning for Real-Time Voltage Control »
Jie Feng · Yuanyuan Shi · Guannan Qu · Steven Low · Anima Anandkumar · Adam Wierman -
2022 : SustainGym: A Benchmark Suite of Reinforcement Learning for Sustainability Applications »
Christopher Yeh · Victor Li · Rajeev Datta · Yisong Yue · Adam Wierman -
2022 : Smoothed-SGDmax: A Stability-Inspired Algorithm to Improve Adversarial Generalization »
Jiancong Xiao · Jiawei Zhang · Zhiquan Luo · Asuman Ozdaglar -
2023 Poster: Time-Reversed Dissipation Induces Duality Between Minimizing Gradient Norm and Function Value »
Jaeyeon Kim · Asuman Ozdaglar · Chanwoo Park · Ernest Ryu -
2023 Poster: Self-Supervised Reinforcement Learning that Transfers using Random Features »
Boyuan Chen · Chuning Zhu · Pulkit Agrawal · Kaiqing Zhang · Abhishek Gupta -
2023 Poster: Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations »
Yiheng Lin · James A. Preiss · Emile Anand · Yingying Li · Yisong Yue · Adam Wierman -
2023 Poster: Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs »
Dongsheng Ding · Chen-Yu Wei · Kaiqing Zhang · Alejandro Ribeiro -
2023 Poster: Beyond Black-Box Advice: Learning-Augmented Algorithms for MDPs with Q-Value Predictions »
Tongxin Li · Yiheng Lin · Shaolei Ren · Adam Wierman -
2023 Poster: Multi-Player Zero-Sum Markov Games with Networked Separable Interactions »
Chanwoo Park · Kaiqing Zhang · Asuman Ozdaglar -
2023 Poster: Anytime-Competitive Reinforcement Learning with Policy Prior »
Jianyi Yang · Pengfei Li · Tongxin Li · Adam Wierman · Shaolei Ren -
2023 Poster: Strategic Distribution Shift of Interacting Agents via Coupled Gradient Flows »
Lauren Conger · Franca Hoffmann · Eric Mazumdar · Lillian Ratliff -
2023 Poster: Robust Learning for Smoothed Online Convex Optimization with Feedback Delay »
Pengfei Li · Jianyi Yang · Adam Wierman · Shaolei Ren -
2023 Poster: Adversarial Attacks on Online Learning to Rank with Click Feedback »
Jinhang Zuo · Zhiyao Zhang · Zhiyong Wang · Shuai Li · Mohammad Hajiesmaili · Adam Wierman -
2023 Poster: SustainGym: Reinforcement Learning Environments for Sustainable Energy Systems »
Christopher Yeh · Victor Li · Rajeev Datta · Julio Arroyo · Nicolas Christianson · Chi Zhang · Yize Chen · Mohammad Mehdi Hosseini · Azarang Golmohammadi · Yuanyuan Shi · Yisong Yue · Adam Wierman -
2022 Poster: What is a Good Metric to Study Generalization of Minimax Learners? »
Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang -
2022 Poster: Bridging Central and Local Differential Privacy in Data Acquisition Mechanisms »
Alireza Fallah · Ali Makhdoumi · azarakhsh malekian · Asuman Ozdaglar -
2022 Poster: Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets »
Chinmay Maheshwari · Shankar Sastry · Eric Mazumdar -
2022 Poster: On the Sample Complexity of Stabilizing LTI Systems on a Single Trajectory »
Yang Hu · Adam Wierman · Guannan Qu -
2022 Poster: Bounded-Regret MPC via Perturbation Analysis: Prediction Error, Constraints, and Nonlinearity »
Yiheng Lin · Yang Hu · Guannan Qu · Tongxin Li · Adam Wierman -
2021 : Q&A with Professor Asu Ozdaglar »
Asuman Ozdaglar -
2021 : Keynote Talk: Personalization in Federated Learning: Adaptation and Clustering (Asu Ozdaglar) »
Asuman Ozdaglar -
2021 Poster: Multi-Agent Reinforcement Learning in Stochastic Networked Systems »
Yiheng Lin · Guannan Qu · Longbo Huang · Adam Wierman -
2021 Poster: Decentralized Q-learning in Zero-sum Markov Games »
Muhammed Sayin · Kaiqing Zhang · David Leslie · Tamer Basar · Asuman Ozdaglar -
2021 Poster: Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems »
Bo Sun · Russell Lee · Mohammad Hajiesmaili · Adam Wierman · Danny Tsang -
2021 Poster: Generalization of Model-Agnostic Meta-Learning Algorithms: Recurring and Unseen Tasks »
Alireza Fallah · Aryan Mokhtari · Asuman Ozdaglar -
2021 Poster: On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning »
Alireza Fallah · Kristian Georgiev · Aryan Mokhtari · Asuman Ozdaglar -
2021 Poster: Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems »
Yiheng Lin · Yang Hu · Guanya Shi · Haoyuan Sun · Guannan Qu · Adam Wierman -
2020 Poster: Online Optimization with Memory and Competitive Control »
Guanya Shi · Yiheng Lin · Soon-Jo Chung · Yisong Yue · Adam Wierman -
2020 Poster: Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach »
Alireza Fallah · Aryan Mokhtari · Asuman Ozdaglar -
2020 Poster: Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward »
Guannan Qu · Yiheng Lin · Adam Wierman · Na Li -
2020 Poster: The Power of Predictions in Online Control »
Chenkai Yu · Guanya Shi · Soon-Jo Chung · Yisong Yue · Adam Wierman -
2019 Poster: Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization »
Gautam Goel · Yiheng Lin · Haoyuan Sun · Adam Wierman -
2019 Spotlight: Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization »
Gautam Goel · Yiheng Lin · Haoyuan Sun · Adam Wierman -
2019 Poster: A Universally Optimal Multistage Accelerated Stochastic Gradient Method »
Necdet Serhat Aybat · Alireza Fallah · Mert Gurbuzbalaban · Asuman Ozdaglar -
2018 Poster: Escaping Saddle Points in Constrained Optimization »
Aryan Mokhtari · Asuman Ozdaglar · Ali Jadbabaie -
2018 Spotlight: Escaping Saddle Points in Constrained Optimization »
Aryan Mokhtari · Asuman Ozdaglar · Ali Jadbabaie -
2017 Poster: When Cyclic Coordinate Descent Outperforms Randomized Coordinate Descent »
Mert Gurbuzbalaban · Asuman Ozdaglar · Pablo A Parrilo · Nuri Vanli -
2017 Spotlight: When Cyclic Coordinate Descent Outperforms Randomized Coordinate Descent »
Mert Gurbuzbalaban · Asuman Ozdaglar · Pablo A Parrilo · Nuri Vanli -
2015 Invited Talk: Incremental Methods for Additive Cost Convex Optimization »
Asuman Ozdaglar -
2013 Poster: Computing the Stationary Distribution Locally »
Christina Lee · Asuman Ozdaglar · Devavrat Shah