
PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Sang Tong · Yan Zheng · Jianye Hao · Matthew Taylor · Jinyi Liu


Learning to collaborate is critical in multi-agent reinforcement learning (MARL). Several recent works propose to promote collaboration by maximizing the correlation of agents’ behaviors, which is typically characterised by mutual information (MI) in different forms; generally, a high collaboration level corresponds to high MI. However, simply maximizing the MI of agents’ behaviors cannot guarantee better collaboration, because sub-optimal collaboration can also yield high MI. To this end, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), to facilitate collaboration efficiently and stably. First, we introduce the Dual Progressive Collaboration Buffer (DPCB), which separately stores superior and inferior samples in a progressive manner. We then train two MI estimators: one maximizes the MI associated with superior collaboration to improve agents’ policies, while the other minimizes the MI associated with inferior collaboration to prevent agents from falling into local optima. Finally, PMIC is general and can be combined with existing MARL algorithms; experiments on several MARL benchmarks show its superior performance compared with other MARL algorithms.
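The abstract does not give implementation details, but the DPCB idea (separately keeping the highest-return and lowest-return samples seen so far, updated progressively) can be illustrated with a minimal sketch. The class name, the use of episode return as the ranking score, and the heap-based eviction policy below are all assumptions for illustration, not the paper's actual implementation:

```python
import heapq

class DualProgressiveBuffer:
    """Hypothetical sketch of a dual buffer in the spirit of PMIC's DPCB:
    one side keeps the `capacity` highest-return (superior) samples,
    the other keeps the `capacity` lowest-return (inferior) samples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._superior = []  # min-heap on return: root is the weakest kept sample
        self._inferior = []  # min-heap on negated return: root is the strongest kept sample
        self._counter = 0    # tie-breaker so heapq never compares payloads

    def add(self, episode_return, transition):
        self._counter += 1
        # Superior side: keep the top-`capacity` returns seen so far.
        if len(self._superior) < self.capacity:
            heapq.heappush(self._superior, (episode_return, self._counter, transition))
        elif episode_return > self._superior[0][0]:
            heapq.heapreplace(self._superior, (episode_return, self._counter, transition))
        # Inferior side: keep the bottom-`capacity` returns (negate to reuse a min-heap).
        if len(self._inferior) < self.capacity:
            heapq.heappush(self._inferior, (-episode_return, self._counter, transition))
        elif -episode_return > self._inferior[0][0]:
            heapq.heapreplace(self._inferior, (-episode_return, self._counter, transition))

    def superior(self):
        return [t for _, _, t in self._superior]

    def inferior(self):
        return [t for _, _, t in self._inferior]
```

In a full PMIC-style pipeline, the superior samples would feed the MI estimator being maximized and the inferior samples the one being minimized; here the buffer alone is shown. For example, after adding returns 3, 1, 5, 2, 4 with capacity 2, `superior()` holds the samples with returns 4 and 5, and `inferior()` those with returns 1 and 2.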