The present contribution deals with decentralized policy evaluation in multi-agent Markov decision processes using temporal-difference (TD) methods with linear function approximation for scalability. The agents cooperate to estimate the value function of such a process by observing continual state transitions of a shared environment over the graph of interconnected nodes (agents), along with locally private rewards. Different from existing consensus-type TD algorithms, the approach here develops a simple decentralized TD tracker by wedding TD learning with gradient tracking techniques. The non-asymptotic properties of the novel TD tracker are established for both independent and identically distributed (i.i.d.) as well as Markovian transitions through a unifying multistep Lyapunov analysis. In contrast to the prior art, the novel algorithm forgoes the limiting error bounds on the number of agents, which endows it with performance comparable to that of centralized TD methods that are the sharpest known to date.
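To make the idea of combining TD learning with gradient tracking concrete, here is a minimal sketch under stated assumptions. It uses one standard gradient-tracking recursion: each agent mixes its parameters with its neighbors, steps along a tracker variable, and the tracker accumulates the change in the local TD(0) semi-gradient. This is not necessarily the exact update analyzed in the paper. The names `td_semi_gradient`, `td_tracker_step`, and `run_demo`, the feature table `Phi`, the reward table `R`, the mixing matrix `W`, and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np


def td_semi_gradient(theta, phi_s, phi_next, reward, gamma):
    """TD(0) semi-gradient direction: TD error times the current feature vector."""
    td_error = reward + gamma * (phi_next @ theta) - phi_s @ theta
    return td_error * phi_s


def td_tracker_step(thetas, trackers, prev_grads, W, alpha,
                    phi_s, phi_next, rewards, gamma):
    """One round of a gradient-tracking TD update (rows index agents)."""
    # Mix parameters with neighbors, then move along the tracked direction.
    new_thetas = W @ thetas + alpha * trackers
    # Each agent evaluates its local TD direction on the shared transition,
    # using its own private reward.
    new_grads = np.stack([
        td_semi_gradient(new_thetas[i], phi_s, phi_next, rewards[i], gamma)
        for i in range(len(rewards))
    ])
    # Tracker: mix with neighbors and add the change in the local direction.
    new_trackers = W @ trackers + new_grads - prev_grads
    return new_thetas, new_trackers, new_grads


def run_demo(steps=50, n_agents=3, n_states=4, dim=2,
             alpha=0.02, gamma=0.9, seed=0):
    """Toy run on a hypothetical environment with uniformly random transitions."""
    rng = np.random.default_rng(seed)
    Phi = rng.uniform(-1.0, 1.0, size=(n_states, dim))  # assumed feature table
    R = rng.uniform(-1.0, 1.0, size=(n_agents, n_states))  # assumed private rewards
    # Symmetric, doubly stochastic mixing matrix for a fully connected graph.
    W = np.full((n_agents, n_agents), 0.5 / (n_agents - 1))
    np.fill_diagonal(W, 0.5)

    thetas = np.zeros((n_agents, dim))
    s, s_next = rng.integers(n_states, size=2)
    grads = np.stack([
        td_semi_gradient(thetas[i], Phi[s], Phi[s_next], R[i, s], gamma)
        for i in range(n_agents)
    ])
    trackers = grads.copy()  # trackers start at the initial local directions

    for _ in range(steps):
        s, s_next = s_next, rng.integers(n_states)
        thetas, trackers, grads = td_tracker_step(
            thetas, trackers, grads, W, alpha,
            Phi[s], Phi[s_next], R[:, s], gamma)
    return thetas, trackers, grads
```

With a doubly stochastic `W` and trackers initialized to the local directions, the network average of the trackers equals the network average of the current local TD directions at every step. That property is what lets each agent follow an estimate of the global direction while only exchanging information with its neighbors.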
Author Information
Gang Wang (Beijing Institute of Technology)
Songtao Lu (IBM Research)
Georgios Giannakis (University of Minnesota)
Gerald Tesauro (IBM TJ Watson Research Center)
Jian Sun (Beijing Institute of Technology)
More from the Same Authors
-
2022 : SCERL: A Benchmark for intersecting language and safe reinforcement learning »
Lan Hoang · Shivam Ratnakar · Nicolas Galichet · Akifumi Wachi · Keerthiram Murugesan · Songtao Lu · Mattia Atzeni · Michael Katz · Subhajit Chaudhury -
2022 : Learning in Factored Domains with Information-Constrained Visual Representations »
Tyler Malloy · Chris Sims · Tim Klinger · Matthew Riemer · Miao Liu · Gerald Tesauro -
2023 Poster: An Alternating Optimization Method for Bilevel Problems under the Polyak-Łojasiewicz Condition »
Quan Xiao · Songtao Lu · Tianyi Chen -
2023 Poster: STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning »
Weipu Zhang · Gang Wang · Jian Sun · Yetian Yuan · Gao Huang -
2023 Poster: Enhancing Sharpness-Aware Optimization Through Variance Suppression »
Bingcong Li · Georgios Giannakis -
2023 Poster: SLM: A Smoothed First-order Lagrangian Method for Structured Constrained Nonconvex Minimization »
Songtao Lu · Jiawei Zhang -
2023 Poster: On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration »
Shuai Zhang · Meng Wang · Hongkang Li · Miao Liu · Pin-Yu Chen · Songtao Lu · Sijia Liu · Keerthiram Murugesan · Subhajit Chaudhury -
2022 : Conditional Moment Alignment for Improved Generalization in Federated Learning »
Jayanth Reddy Regatti · Songtao Lu · Abhishek Gupta · Ness Shroff -
2022 Poster: A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization »
Songtao Lu · Siliang Zeng · Xiaodong Cui · Mark Squillante · Lior Horesh · Brian Kingsbury · Jia Liu · Mingyi Hong -
2022 Poster: Understanding Benign Overfitting in Gradient-Based Meta Learning »
Lisha Chen · Songtao Lu · Tianyi Chen -
2022 Poster: Influencing Long-Term Behavior in Multiagent Reinforcement Learning »
Dong-Ki Kim · Matthew Riemer · Miao Liu · Jakob Foerster · Michael Everett · Chuangchuang Sun · Gerald Tesauro · Jonathan How -
2021 Poster: Heavy Ball Momentum for Conditional Gradient »
Bingcong Li · Alireza Sadeghi · Georgios Giannakis -
2021 Poster: Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning »
Xin Zhang · Zhuqing Liu · Jia Liu · Zhengyuan Zhu · Songtao Lu -
2020 Poster: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2020 Poster: ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training »
Chia-Yu Chen · Jiamin Ni · Songtao Lu · Xiaodong Cui · Pin-Yu Chen · Xiao Sun · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Wei Zhang · Kailash Gopalakrishnan -
2020 Spotlight: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2019 Poster: Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients »
Jun Sun · Tianyi Chen · Georgios Giannakis · Zaiyue Yang -
2018 Poster: LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning »
Tianyi Chen · Georgios Giannakis · Tao Sun · Wotao Yin -
2018 Spotlight: LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning »
Tianyi Chen · Georgios Giannakis · Tao Sun · Wotao Yin -
2018 Poster: Learning Abstract Options »
Matthew Riemer · Miao Liu · Gerald Tesauro -
2018 Poster: Dialog-based Interactive Image Retrieval »
Xiaoxiao Guo · Hui Wu · Yu Cheng · Steven Rennie · Gerald Tesauro · Rogerio Feris -
2017 Workshop: Conversational AI - today's practice and tomorrow's potential »
Alborz Geramifard · Jason Williams · Larry Heck · Jim Glass · Antoine Bordes · Steve Young · Gerald Tesauro -
2017 Poster: Solving Most Systems of Random Quadratic Equations »
Gang Wang · Georgios Giannakis · Yousef Saad · Jie Chen -
2016 Poster: Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow »
Gang Wang · Georgios Giannakis -
2015 : Deep RL in Games Research »
Gerald Tesauro -
2007 Spotlight: Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning »
Gerald Tesauro · Rajarshi Das · Hoi Chan · Jeffrey O Kephart · David Levine · Freeman Rawson · Charles Lefurgy -
2007 Poster: Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning »
Gerald Tesauro · Rajarshi Das · Hoi Chan · Jeffrey O Kephart · David Levine · Freeman Rawson · Charles Lefurgy