ResQ: A Residual Q Function-based Approach for Multi-Agent Reinforcement Learning Value Factorization
Siqi Shen · Mengwei Qiu · Jun Liu · Weiquan Liu · Yongquan Fu · Xinwang Liu · Cheng Wang

Thu Dec 01 09:00 AM -- 11:00 AM (PST) @ Hall J #402

The factorization of state-action value functions for Multi-Agent Reinforcement Learning (MARL) is important. Existing studies are limited by their representation capability, sample efficiency, and approximation error. To address these challenges, we propose ResQ, a MARL value function factorization method that can find the optimal joint policy for any state-action value function through residual functions. ResQ masks some state-action value pairs from a joint state-action value function, which is transformed into the sum of a main function and a residual function. ResQ can be used with mean-value and stochastic-value RL. We theoretically show that ResQ can satisfy both the individual global max (IGM) principle and the distributional IGM principle without representation limitations. Through experiments on matrix games, predator-prey, and StarCraft benchmarks, we show that ResQ can obtain better results than multiple expected/stochastic value factorization methods.
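
As a rough illustration of the decomposition described above, the following sketch splits a small joint value table into an additive main function plus a masked residual and checks that the per-agent greedy actions match the joint optimum (the IGM condition). It is a toy under stated assumptions, not the authors' implementation: the payoff matrix, the hand-picked utilities Q1 and Q2, the additive form of the main function, and the choice of a mask that is zero only at the greedy joint action are all assumptions made for this example, and nothing here is learned.

```python
from itertools import product

# Hypothetical 2-agent, 3-action matrix game: Q_JT[u1][u2] is the joint value.
Q_JT = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]

# Hand-picked per-agent utilities (assumed for illustration, not learned).
Q1 = [4.0, 0.0, 0.0]
Q2 = [4.0, 0.0, 0.0]


def argmax(values):
    """Return the index of the first maximal entry."""
    return max(range(len(values)), key=lambda i: values[i])


def decompose(q_jt, q1, q2):
    """Split q_jt into an additive main function and a masked residual.

    The mask is 0 at the greedy joint action and 1 elsewhere (an assumption
    of this sketch), so the residual only corrects non-greedy entries.
    """
    greedy = (argmax(q1), argmax(q2))
    main, mask, residual = {}, {}, {}
    for u in product(range(len(q1)), range(len(q2))):
        main[u] = q1[u[0]] + q2[u[1]]          # main function: sum of utilities
        mask[u] = 0.0 if u == greedy else 1.0  # masked-out greedy entry
        residual[u] = q_jt[u[0]][u[1]] - main[u]
    return greedy, main, mask, residual


greedy, main, mask, residual = decompose(Q_JT, Q1, Q2)

# Joint action that maximizes the true joint value table.
joint_best = max(main, key=lambda u: Q_JT[u[0]][u[1]])

# Main function plus masked residual, evaluated at every joint action.
reconstructed = {u: main[u] + mask[u] * residual[u] for u in main}

print("greedy joint action from utilities:", greedy)
print("argmax of joint value table:", joint_best)
print("IGM holds:", greedy == joint_best)
```

In this toy, the main function equals the joint value at the greedy action and is an upper bound elsewhere, so every masked residual is non-positive and the sum reproduces the full table. A purely additive function alone could not represent this table, which is the kind of representation limit the abstract refers to.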

Author Information

Siqi Shen (Xiamen University)
Mengwei Qiu (Xiamen University)
Jun Liu (Xiamen University)
Weiquan Liu (Xiamen University)
Yongquan Fu (National University of Defense Technology)

I am an associate professor (Master Supervisor) in the National Key Laboratory for Parallel and Distributed Processing and the College of Computer Science at the National University of Defense Technology. My research focuses on network measurement and performance optimization for data center and geo-distributed networking systems. I am particularly interested in solving problems motivated by online data-intensive applications, online social networks, and large-scale data.

Xinwang Liu (National University of Defense Technology)
Cheng Wang (Xiamen University, Tsinghua University)

More from the Same Authors