Multi-Agent Reinforcement Learning (MARL) has demonstrated significant success by virtue of collaboration across agents. Recent work, on the other hand, introduces surprise, which quantifies the degree of change in an agent's environment. Surprise-based learning has received significant attention in the case of single-agent entropic settings but remains an open problem for fast-paced dynamics in multi-agent scenarios. A potential alternative to address surprise may be realized through the lens of free-energy minimization. We explore surprise minimization in multi-agent learning by utilizing the free energy across all agents in a multi-agent system. A temporal Energy-Based Model (EBM) represents an estimate of surprise, which is minimized over the joint agent distribution. Our formulation of the EBM is theoretically akin to the minimum conjugate entropy objective and highlights suitable convergence towards minimally surprising states. We further validate our theoretical claims in an empirical study of multi-agent tasks demanding collaboration in the presence of fast-paced dynamics. Our implementation and agent videos are available at the Project Webpage.
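To make the mechanism described above concrete, the following is a minimal sketch (not the authors' implementation) of how a temporal EBM could score joint agent transitions as a surprise estimate and shape per-agent rewards toward surprise minimization. The names `TemporalEBM`, `ebm_loss`, and `surprise_penalised_reward`, the contrastive training objective, and the trade-off coefficient `beta` are all illustrative assumptions; the paper's actual architecture and objective may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalEBM(nn.Module):
    """Energy network E_theta(s_t, s_{t+1}) over the joint agent observation.

    Hypothetical sketch: joint observations of all agents are concatenated,
    and the scalar energy of a transition serves as the surprise estimate
    (low energy <-> expected, unsurprising joint transition).
    """

    def __init__(self, joint_obs_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * joint_obs_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, joint_obs: torch.Tensor, next_joint_obs: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, next_joint_obs], dim=-1))


def ebm_loss(ebm: TemporalEBM, joint_obs: torch.Tensor, next_joint_obs: torch.Tensor) -> torch.Tensor:
    """Simple contrastive objective (an assumption, not the paper's loss):
    push down energy of observed transitions, push up energy of mismatched
    transitions built by shuffling next states within the batch."""
    pos = ebm(joint_obs, next_joint_obs)
    neg = ebm(joint_obs, next_joint_obs[torch.randperm(next_joint_obs.size(0))])
    return F.softplus(pos - neg).mean()


def surprise_penalised_reward(ebm, joint_obs, next_joint_obs, env_reward, beta: float = 0.1):
    """Shape rewards with the estimated surprise (energy); beta is a
    hypothetical coefficient trading off task reward and surprise."""
    with torch.no_grad():
        surprise = ebm(joint_obs, next_joint_obs).squeeze(-1)
    return env_reward - beta * surprise


if __name__ == "__main__":
    n_agents, obs_dim, batch = 3, 8, 32
    joint_dim = n_agents * obs_dim
    ebm = TemporalEBM(joint_dim)
    opt = torch.optim.Adam(ebm.parameters(), lr=1e-3)

    s = torch.randn(batch, joint_dim)       # joint observations at time t
    s_next = torch.randn(batch, joint_dim)  # joint observations at t+1
    r = torch.randn(batch)                  # environment rewards

    loss = ebm_loss(ebm, s, s_next)         # fit the energy model
    opt.zero_grad(); loss.backward(); opt.step()

    shaped = surprise_penalised_reward(ebm, s, s_next, r)
```

In this sketch the energy is evaluated on the joint (all-agent) observation rather than per agent, mirroring the abstract's point that surprise is minimized over the joint agent distribution rather than independently per agent.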
Author Information
Karush Suri (University of Toronto)
Xiao Qi Shi (University of Toronto)
Konstantinos N Plataniotis (University of Toronto)
Yuri Lawryshyn (University of Toronto)