Deep reinforcement learning (DRL) has achieved super-human performance on complex video games (e.g., StarCraft II and Dota II). However, current DRL systems still struggle with multi-agent coordination, sparse rewards, and stochastic environments, among other challenges. To address these challenges, we use a football video game, Google Research Football (GRF), as our testbed and develop an end-to-end learning-based AI system (denoted as TiKick) to complete this challenging task. In this work, we first generate a large replay dataset from the self-play of single-agent experts, which are obtained from league training. We then develop a new offline algorithm to learn a powerful multi-agent AI from the fixed single-agent dataset. To the best of our knowledge, TiKick is the first learning-based AI system that can take over the multi-agent Google Research Football full game, while previous work could either control only a single agent or experiment on toy academic scenarios. Extensive experiments further show that our pre-trained model can accelerate the training process of modern multi-agent algorithms and that our method achieves state-of-the-art performance on various academic scenarios.
Author Information
Shiyu Huang (Tsinghua University)
I am a fifth-year Ph.D. student in the Department of Computer Science and Technology, Tsinghua University, China, advised by Prof. Jun Zhu and Prof. Ting Chen. My research interests lie at the intersection of computer vision, reinforcement learning, and deep learning. I have also spent time working at Huawei Noah's Ark Lab, Tencent AI Lab, Carnegie Mellon University, and Sensetime Inc. I am also the founder of the TARTRL group.
Wenze Chen (Tsinghua University)
Longfei Zhang (National University of Defense Technology)
Shizhen Xu (RealAI)
Ziyang Li (Tencent AI Lab)
Fengming Zhu (Tencent AI Lab)
Deheng Ye (Tencent)
Ting Chen (Tsinghua University)
Jun Zhu (Tsinghua University)
More from the Same Authors
- 2022 Poster: Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
  Hua Wei · Jingxiao Chen · Xiyang Ji · Hongyang Qin · Minwen Deng · Siqin Li · Liang Wang · Weinan Zhang · Yong Yu · Liu Linc · Lanxiao Huang · Deheng Ye · Qiang Fu · Wei Yang
- 2021 Poster: Coordinated Proximal Policy Optimization
  Zifan Wu · Chao Yu · Deheng Ye · Junge Zhang · haiyin piao · Hankz Hankui Zhuo
- 2021 Poster: Learning Diverse Policies in MOBA Games via Macro-Goals
  Yiming Gao · Bei Shi · Xueying Du · Liang Wang · Guangwei Chen · Zhenjie Lian · Fuhao Qiu · GUOAN HAN · Weixuan Wang · Deheng Ye · Qiang Fu · Wei Yang · Lanxiao Huang
- 2020 Poster: Towards Playing Full MOBA Games with Deep Reinforcement Learning
  Deheng Ye · Guibin Chen · Wen Zhang · Sheng Chen · Bo Yuan · Bo Liu · Jia Chen · Zhao Liu · Fuhao Qiu · Hongsheng Yu · Yinyuting Yin · Bei Shi · Liang Wang · Tengfei Shi · Qiang Fu · Wei Yang · Lanxiao Huang · Wei Liu
- 2018 Poster: Semi-crowdsourced Clustering with Deep Generative Models
  Yucen Luo · TIAN TIAN · Jiaxin Shi · Jun Zhu · Bo Zhang
- 2018 Poster: Stochastic Expectation Maximization with Variance Reduction
  Jianfei Chen · Jun Zhu · Yee Whye Teh · Tong Zhang