Poster
Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning
Chao Qu · Shie Mannor · Huan Xu · Yuan Qi · Le Song · Junwu Xiong
Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C #200
We consider the networked multi-agent reinforcement learning (MARL) problem in a fully decentralized setting, where agents learn to coordinate to achieve joint success. This problem is widely encountered in many areas including traffic control, distributed control, and smart grids.
We assume each agent is located at a node of a communication network and can exchange information only with its neighbors. Using softmax temporal consistency, we derive a primal-dual decentralized optimization method and obtain a principled and data-efficient iterative algorithm named value propagation. We prove a non-asymptotic convergence rate of $\mathcal{O}(1/T)$ with nonlinear function approximation. To the best of our knowledge, it is the first MARL algorithm with a convergence guarantee in the control, off-policy, non-linear function approximation, fully decentralized setting.
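As a rough illustration of the two ingredients the abstract names (softmax temporal consistency and neighbor-only communication), the following minimal sketch may help. It is not the authors' value propagation algorithm: it leaves out the primal-dual updates and the nonlinear function approximators entirely. The function names, the temperature `lam`, and the four-agent ring topology are assumptions made for this example, and the consistency equation used is the standard entropy-regularized form $V(s) = r + \gamma V(s') - \lambda \log \pi(a|s)$.

```python
import math


def soft_value(q_values, lam):
    """Softmax (log-sum-exp) state value: lam * log(sum_a exp(Q(s, a) / lam))."""
    m = max(q_values)  # subtract the max for numerical stability
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))


def consistency_residual(v_s, v_next, reward, log_pi, gamma, lam):
    """Residual of V(s) = r + gamma * V(s') - lam * log pi(a|s).

    The residual is zero when the value function and policy are consistent.
    """
    return reward + gamma * v_next - lam * log_pi - v_s


def neighbor_average(values, neighbors):
    """One consensus step: each agent averages its own value with its neighbors' values."""
    return [
        (values[i] + sum(values[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(values))
    ]


# Hypothetical one-step problem (gamma = 0), so Q(s, a) equals the reward.
q, lam = [1.0, 2.0], 1.0
v = soft_value(q, lam)
for q_a in q:
    log_pi = (q_a - v) / lam  # log of the softmax policy pi(a|s)
    print(consistency_residual(v, 0.0, q_a, log_pi, gamma=0.0, lam=lam))

# Hypothetical ring of four agents; each one talks only to its two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimates = [0.0, 4.0, 8.0, 4.0]
for _ in range(50):
    estimates = neighbor_average(estimates, ring)
print(estimates)
```

In this toy setup the residuals vanish for the softmax policy, and repeated neighbor averaging drives the agents' local estimates toward their network-wide mean without any central coordinator.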
Author Information
Chao Qu (Ant Financial Services Group)
Shie Mannor (Technion)
Huan Xu (Georgia Inst. of Technology)
Yuan Qi (Ant Financial Services Group)
Le Song (Ant Financial Services Group)
Junwu Xiong (Ant Financial Services Group)
More from the Same Authors
-
2021 : Bandits with Partially Observable Confounded Data »
Guy Tennenholtz · Uri Shalit · Shie Mannor · Yonathan Efroni -
2021 : Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning »
Guy Tennenholtz · Assaf Hallak · Gal Dalal · Shie Mannor · Gal Chechik · Uri Shalit -
2021 : Latent Geodesics of Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Nir Baram · Shie Mannor -
2022 : Digital Human Interactive Recommendation Decision-Making Based on Reinforcement Learning »
Junwu Xiong -
2020 : Mini-panel discussion 2 - Real World RL: An industry perspective »
Franziska Meier · Gabriel Dulac-Arnold · Shie Mannor · Timothy A Mann -
2020 Workshop: The Challenges of Real World Reinforcement Learning »
Daniel Mankowitz · Gabriel Dulac-Arnold · Shie Mannor · Omer Gottesman · Anusha Nagabandi · Doina Precup · Timothy A Mann -
2020 Poster: Bandit Samplers for Training Graph Neural Networks »
Ziqi Liu · Zhengwei Wu · Zhiqiang Zhang · Jun Zhou · Shuang Yang · Le Song · Yuan Qi -
2020 Poster: Online Planning with Lookahead Policies »
Yonathan Efroni · Mohammad Ghavamzadeh · Shie Mannor -
2019 : Invited Talk by Yuan (Alan) Qi (Ant Financial) »
Yuan Qi -
2019 Poster: Distributional Policy Optimization: An Alternative Approach for Continuous Control »
Chen Tessler · Guy Tennenholtz · Shie Mannor -
2019 Poster: Large Scale Markov Decision Processes with Changing Rewards »
Adrian Rivera Cardoso · He Wang · Huan Xu -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Hierarchical RL: From Prior Knowledge to Policies (Shie Mannor) »
Shie Mannor -
2018 Poster: Robust Hypothesis Testing Using Wasserstein Uncertainty Sets »
Rui Gao · Liyan Xie · Yao Xie · Huan Xu -
2018 Spotlight: Robust Hypothesis Testing Using Wasserstein Uncertainty Sets »
Rui Gao · Liyan Xie · Yao Xie · Huan Xu -
2018 Poster: Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning »
Tom Zahavy · Matan Haroush · Nadav Merlis · Daniel J Mankowitz · Shie Mannor -
2017 Workshop: Hierarchical Reinforcement Learning »
Andrew G Barto · Doina Precup · Shie Mannor · Tom Schaul · Roy Fox · Carlos Florensa -
2017 Poster: Reinforcement Learning under Model Mismatch »
Aurko Roy · Huan Xu · Sebastian Pokutta