In many real-world cooperative multiagent reinforcement learning (MARL) tasks, teams of agents can rehearse together before deployment, but communication constraints may then force individual agents to execute independently once deployed. Centralized training and decentralized execution (CTDE), which focuses mainly on this setting, has become increasingly popular in recent years. In the value-based MARL branch, a credit assignment mechanism is typically used to factorize the team reward into each individual’s reward; individual-global-max (IGM) is a condition on the factorization ensuring that the agents’ action choices coincide with the team’s optimal joint action. However, current architectures fail to consider local coordination within sub-teams, which should be exploited for more effective factorization and, in turn, faster learning. We propose a novel value factorization framework, called multiagent Q-learning with sub-team coordination (QSCAN), to flexibly represent sub-team coordination while honoring the IGM condition. QSCAN encompasses the full spectrum of sub-team coordination according to sub-team size, ranging from the monotonic value function class to the entire IGM function class, with familiar methods such as QMIX and QPLEX located at the respective extremes of the spectrum. Experimental results show that QSCAN outperforms state-of-the-art methods in matrix games, predator-prey tasks, and the Switch challenge in MA-Gym. Additionally, QSCAN achieves performance comparable to those methods in a selection of StarCraft II micro-management tasks.
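To make the IGM condition mentioned in the abstract concrete, the following is a minimal sketch on a two-agent matrix game. It is not QSCAN's architecture: the function names, the utility values, and both joint value tables are hypothetical and chosen only for illustration. The sketch checks whether the greedy joint action under a joint value table matches the tuple of each agent's greedy action under its own utility. It assumes all maxima are unique, so ties are not handled.

```python
from itertools import product


def greedy_joint_action(q_tot):
    """Return the joint action (a1, a2) that maximizes a two-agent joint value table."""
    n1, n2 = len(q_tot), len(q_tot[0])
    return max(product(range(n1), range(n2)), key=lambda a: q_tot[a[0]][a[1]])


def greedy_individual_actions(q_agents):
    """Return each agent's greedy action, computed from its own utility list only."""
    return tuple(max(range(len(q)), key=lambda a: q[a]) for q in q_agents)


def satisfies_igm(q_tot, q_agents):
    """IGM holds here if the joint argmax equals the tuple of individual argmaxes."""
    return greedy_joint_action(q_tot) == greedy_individual_actions(q_agents)


# Hypothetical per-agent utilities (illustrative values, not from the paper).
q1 = [1.0, 3.0, 2.0]
q2 = [0.5, 0.0, 4.0]

# Additive mixing is one simple monotonic factorization: the joint value
# never decreases when an individual utility increases.
q_tot_additive = [[a + b for b in q2] for a in q1]

# A hypothetical joint table whose optimum (0, 0) disagrees with the
# agents' individual greedy choices (1, 2), so IGM is violated.
q_tot_mismatch = [
    [8.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(satisfies_igm(q_tot_additive, [q1, q2]))
print(satisfies_igm(q_tot_mismatch, [q1, q2]))
```

The additive table passes the check because a sum of individual utilities is maximized by maximizing each term separately. The second table fails it, which is the kind of mismatch a factorization method must rule out for decentralized greedy execution to recover the team's optimal joint action.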
Author Information
Wenhan Huang (Shanghai Jiao Tong University)
Kai Li (Huawei Noah's Ark Lab)
Kun Shao (Huawei Noah's Ark Lab)
Tianze Zhou (Beijing Institute of Technology)
Matthew Taylor (U. of Alberta)
Jun Luo (Huawei Technologies Ltd.)
Dongge Wang (Swiss Federal Institute of Technology Lausanne)
Hangyu Mao (Huawei Technologies Co., Ltd.)
Jianye Hao (Tianjin University)
Jun Wang (UCL)
Xiaotie Deng (Peking University)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Multiagent Q-learning with Sub-Team Coordination »
More from the Same Authors
-
2021 : Nash Convergence of Mean-Based Learning Algorithms in First Price Auctions »
Xiaotie Deng · Xinyan Hu · Tao Lin · Weiqiang Zheng -
2021 : Safe Evaluation For Offline Learning: Are We Ready To Deploy? »
Hager Radi · Josiah Hanna · Peter Stone · Matthew Taylor -
2022 Poster: M2N: Mesh Movement Networks for PDE Solvers »
Wenbin Song · Mingrui Zhang · Joseph G Wallwork · Junpeng Gao · Zheng Tian · Fanglei Sun · Matthew Piggott · Junqing Chen · Zuoqiang Shi · Xiang Chen · Jun Wang -
2022 Poster: Plan To Predict: Learning an Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning »
Zifan Wu · Chao Yu · Chen Chen · Jianye Hao · Hankz Hankui Zhuo -
2022 Poster: Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing »
Yaodong Yang · Guangyong Chen · Weixun Wang · Xiaotian Hao · Jianye Hao · Pheng-Ann Heng -
2022 Poster: Versatile Multi-stage Graph Neural Network for Circuit Representation »
shuwen yang · Zhihao Yang · Dong Li · Yingxueff Zhang · Zhanguang Zhang · Guojie Song · Jianye Hao -
2022 : Build generally reusable agent-environment interaction models »
Jun Jin · Hongming Zhang · Jun Luo -
2022 : Contextual Transformer for Offline Meta Reinforcement Learning »
Runji Lin · Ye Li · Xidong Feng · Zhaowei Zhang · XIAN HONG WU FUNG · Haifeng Zhang · Jun Wang · Yali Du · Yaodong Yang -
2022 : Fifteen-minute Competition Overview Video »
Tianpei Yang · Iuliia Kotseruba · Montgomery Alban · Amir Rasouli · Soheil Mohamad Alizadeh Shabestary · Randolph Goebel · Matthew Taylor · Liam Paull · Florian Shkurti -
2022 : Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes »
Min Zhang · Hongyao Tang · Jianye Hao · YAN ZHENG -
2022 : EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model »
Yifu Yuan · Jianye Hao · Fei Ni · Yao Mu · YAN ZHENG · Yujing Hu · Jinyi Liu · Yingfeng Chen · Changjie Fan -
2022 : ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation »
Pengyi Li · Hongyao Tang · Jianye Hao · YAN ZHENG · Xian Fu · Zhaopeng Meng -
2022 : Do As You Teach: A Multi-Teacher Approach to Self-Play in Deep Reinforcement Learning »
Chaitanya Kharyal · Tanmay Sinha · Vijaya Sai Krishna Gottipati · Srijita Das · Matthew Taylor -
2022 : Planning Immediate Landmarks of Targets for Model-Free Skill Transfer across Agents »
Minghuan Liu · Zhengbang Zhu · Menghui Zhu · Yuzheng Zhuang · Weinan Zhang · Jianye Hao -
2022 Workshop: Deep Reinforcement Learning Workshop »
Karol Hausman · Qi Zhang · Matthew Taylor · Martha White · Suraj Nair · Manan Tomar · Risto Vuorio · Ted Xiao · Zeyu Zheng · Manan Tomar -
2022 Spotlight: Lightning Talks 5A-3 »
Minting Pan · Xiang Chen · Wenhan Huang · Can Chang · Zhecheng Yuan · Jianzhun Shao · Yushi Cao · Peihao Chen · Ke Xue · Zhengrong Xue · Zhiqiang Lou · Xiangming Zhu · Lei Li · Zhiming Li · Kai Li · Jiacheng Xu · Dongyu Ji · Ni Mu · Kun Shao · Tianpei Yang · Kunyang Lin · Ningyu Zhang · Yunbo Wang · Lei Yuan · Bo Yuan · Hongchang Zhang · Jiajun Wu · Tianze Zhou · Xueqian Wang · Ling Pan · Yuhang Jiang · Xiaokang Yang · Xiaozhuan Liang · Hao Zhang · Weiwen Hu · Miqing Li · YAN ZHENG · Matthew Taylor · Huazhe Xu · Shumin Deng · Chao Qian · YI WU · Shuncheng He · Wenbing Huang · Chuanqi Tan · Zongzhang Zhang · Yang Gao · Jun Luo · Yi Li · Xiangyang Ji · Thomas Li · Mingkui Tan · Fei Huang · Yang Yu · Huazhe Xu · Dongge Wang · Jianye Hao · Chuang Gan · Yang Liu · Luo Si · Hangyu Mao · Huajun Chen · Jianye Hao · Jun Wang · Xiaotie Deng -
2022 Spotlight: Plan To Predict: Learning an Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning »
Zifan Wu · Chao Yu · Chen Chen · Jianye Hao · Hankz Hankui Zhuo -
2022 Spotlight: DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning »
Yao Mu · Yuzheng Zhuang · Fei Ni · Bin Wang · Jianyu Chen · Jianye Hao · Ping Luo -
2022 Spotlight: GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis »
Yushi Cao · Zhiming Li · Tianpei Yang · Hao Zhang · YAN ZHENG · Yi Li · Jianye Hao · Yang Liu -
2022 Spotlight: Lightning Talks 5A-1 »
Yao Mu · Jin Zhang · Haoyi Niu · Rui Yang · Mingdong Wu · Ze Gong · Shubham Sharma · Chenjia Bai · Yu ("Tony") Zhang · Siyuan Li · Yuzheng Zhuang · Fangwei Zhong · Yiwen Qiu · Xiaoteng Ma · Fei Ni · Yulong Xia · Chongjie Zhang · Hao Dong · Ming Li · Zhaoran Wang · Bin Wang · Chongjie Zhang · Jianyu Chen · Guyue Zhou · Lei Han · Jianming HU · Jianye Hao · Xianyuan Zhan · Ping Luo -
2022 Spotlight: Lightning Talks 3A-2 »
shuwen yang · Xu Zhang · Delvin Ce Zhang · Lan-Zhe Guo · Renzhe Xu · Zhuoer Xu · Yao-Xiang Ding · Weihan Li · Xingxuan Zhang · Xi-Zhu Wu · Zhenyuan Yuan · Hady Lauw · Yu Qi · Yi-Ge Zhang · Zhihao Yang · Guanghui Zhu · Dong Li · Changhua Meng · Kun Zhou · Gang Pan · Zhi-Fan Wu · Bo Li · Minghui Zhu · Zhi-Hua Zhou · Yafeng Zhang · Yingxueff Zhang · shiwen cui · Jie-Jing Shao · Zhanguang Zhang · Zhenzhe Ying · Xiaolong Chen · Yu-Feng Li · Guojie Song · Peng Cui · Weiqiang Wang · Ming GU · Jianye Hao · Yihua Huang -
2022 Spotlight: Versatile Multi-stage Graph Neural Network for Circuit Representation »
shuwen yang · Zhihao Yang · Dong Li · Yingxueff Zhang · Zhanguang Zhang · Guojie Song · Jianye Hao -
2022 Spotlight: Optimistic Tree Searches for Combinatorial Black-Box Optimization »
Cedric Malherbe · Antoine Grosnit · Rasul Tutunov · Haitham Bou Ammar · Jun Wang -
2022 Competition: Driving SMARTS »
Amir Rasouli · Matthew Taylor · Iuliia Kotseruba · Tianpei Yang · Randolph Goebel · Soheil Mohamad Alizadeh Shabestary · Montgomery Alban · Florian Shkurti · Liam Paull -
2022 Workshop: Reinforcement Learning for Real Life (RL4RealLife) Workshop »
Yuxi Li · Emma Brunskill · MINMIN CHEN · Omer Gottesman · Lihong Li · Yao Liu · Zhiwei Tony Qin · Matthew Taylor -
2022 Poster: A Simple Decentralized Cross-Entropy Method »
Zichen Zhang · Jun Jin · Martin Jagersand · Jun Luo · Dale Schuurmans -
2022 Poster: Optimistic Tree Searches for Combinatorial Black-Box Optimization »
Cedric Malherbe · Antoine Grosnit · Rasul Tutunov · Haitham Bou Ammar · Jun Wang -
2022 Poster: GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis »
Yushi Cao · Zhiming Li · Tianpei Yang · Hao Zhang · YAN ZHENG · Yi Li · Jianye Hao · Yang Liu -
2022 Poster: DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning »
Yao Mu · Yuzheng Zhuang · Fei Ni · Bin Wang · Jianyu Chen · Jianye Hao · Ping Luo -
2022 Poster: Enhancing Safe Exploration Using Safety State Augmentation »
Aivar Sootla · Alexander Cowen-Rivers · Jun Wang · Haitham Bou Ammar -
2022 Poster: The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design »
Ruoyu Cheng · Xianglong Lyu · Yang Li · Junjie Ye · Jianye Hao · Junchi Yan -
2022 Poster: Multi-Agent Reinforcement Learning is a Sequence Modeling Problem »
Muning Wen · Jakub Kuba · Runji Lin · Weinan Zhang · Ying Wen · Jun Wang · Yaodong Yang -
2022 Poster: A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning »
Bo Liu · Xidong Feng · Jie Ren · Luo Mai · Rui Zhu · Haifeng Zhang · Jun Wang · Yaodong Yang -
2021 : Reward and State Design: Towards Learning to Teach »
Alex Lewandowski · Calarina Muslimani · Matthew Taylor · Jun Luo -
2021 : Learning Representations for Pixel-based Control: What Matters and Why? »
Manan Tomar · Utkarsh A Mishra · Amy Zhang · Matthew Taylor -
2021 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · David Silver · Matthew Taylor · Martha White · Srijita Das · Yuqing Du · Andrew Patterson · Manan Tomar · Olivia Watkins -
2020 : Contributed Talk: Maximum Reward Formulation In Reinforcement Learning »
Vijaya Sai Krishna Gottipati · Yashaswi Pathak · Rohan Nuttall · Sahir . · Raviteja Chunduru · Ahmed Touati · Sriram Ganapathi · Matthew Taylor · Sarath Chandar -
2020 Poster: Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping »
Yujing Hu · Weixun Wang · Hangtian Jia · Yixiang Wang · Yingfeng Chen · Jianye Hao · Feng Wu · Changjie Fan -
2020 Poster: A Game-Theoretic Analysis of the Empirical Revenue Maximization Algorithm with Endogenous Sampling »
Xiaotie Deng · Ron Lavi · Tao Lin · Qi Qi · Wenwei WANG · Xiang Yan -
2018 Poster: A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents »
YAN ZHENG · Zhaopeng Meng · Jianye Hao · Zongzhang Zhang · Tianpei Yang · Changjie Fan