Timezone: »

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning
Tianpei Yang · Weixun Wang · Hongyao Tang · Jianye Hao · Zhaopeng Meng · Hangyu Mao · Dong Li · Wulong Liu · Yingfeng Chen · Yujing Hu · Changjie Fan · Chengwei Zhang

Tue Dec 07 04:30 PM -- 06:00 PM (PST) @ None #None

Transfer Learning has shown great potential to enhance single-agent Reinforcement Learning (RL) efficiency. Similarly, Multiagent RL (MARL) can also be accelerated if agents can share knowledge with each other. However, it remains a problem of how an agent should learn from other agents. In this paper, we propose a novel Multiagent Policy Transfer Framework (MAPTF) to improve MARL efficiency. MAPTF learns which agent's policy is the best to reuse for each agent and when to terminate it by modeling multiagent policy transfer as the option learning problem. Furthermore, in practice, the option module can only collect all agent's local experiences for update due to the partial observability of the environment. While in this setting, each agent's experience may be inconsistent with each other, which may cause the inaccuracy and oscillation of the option-value's estimation. Therefore, we propose a novel option learning algorithm, the successor representation option learning to solve it by decoupling the environment dynamics from rewards and learning the option-value under each agent's preference. MAPTF can be easily combined with existing deep RL and MARL approaches, and experimental results show it significantly boosts the performance of existing methods in both discrete and continuous state spaces.

Author Information

Tianpei Yang (Tianjin University, University of Alberta)
Weixun Wang (Tianjin University)
Hongyao Tang (Tianjin University)
Jianye Hao (Tianjin University)
Zhaopeng Meng (School of Computer Software, Tianjin University)
Hangyu Mao (Peking University)
Dong Li (Huawei Noah’s Ark Lab)
Wulong Liu (Huawei Noah's Ark Lab)
Yingfeng Chen
Yujing Hu (NetEase Fuxi AI Lab)
Changjie Fan (NetEase Fuxi AI Lab)
Chengwei Zhang (Dalian maritime university)

More from the Same Authors