Poster
Modelling the Dynamics of Multiagent Q-Learning in Repeated Symmetric Games: a Mean Field Theoretic Approach
Shuyue Hu · Chin-wing Leung · Ho-fung Leung
East Exhibition Hall B, C #200
Keywords: [ Statistical Physics of Learning ] [ Theory ] [ Reinforcement Learning and Planning ] [ Multi-Agent RL ]
Modelling the dynamics of multi-agent learning has long been an important research topic, but previous works focus on 2-agent settings and mostly use evolutionary game theoretic approaches. In this paper, we study an n-agent setting as n tends to infinity, in which agents concurrently learn their policies over repeated symmetric bimatrix games with other agents. Using mean field theory, we approximate the effects of other agents on a single agent by an averaged effect. We derive a Fokker-Planck equation that describes the evolution of the probability distribution of Q-values in the agent population. To the best of our knowledge, this is the first work to show that the Q-learning dynamics under an n-agent setting can be described by a system of only three equations. We validate our model through comparisons with agent-based simulations on typical symmetric bimatrix games and different initial settings of Q-values.
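To make the setting concrete, below is a minimal sketch (not the authors' code) of the kind of agent-based simulation the abstract describes: a large population of stateless Q-learning agents repeatedly matched at random in a symmetric 2x2 game, whose population distribution of Q-values the mean field model would track. The payoff matrix, Boltzmann temperature, learning rate, pairing scheme, and Gaussian initialization are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000                        # population size (large, to mimic n -> infinity)
alpha = 0.1                     # learning rate (assumed)
tau = 1.0                       # Boltzmann exploration temperature (assumed)
payoff = np.array([[3.0, 0.0],  # illustrative symmetric 2x2 payoff matrix
                   [5.0, 1.0]])

Q = rng.normal(0.0, 1.0, size=(n, 2))  # initial Q-values (assumed Gaussian)

for t in range(2000):
    # Boltzmann (softmax) policies derived from current Q-values
    logits = Q / tau
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    actions = (rng.random(n) < probs[:, 1]).astype(int)

    # Random pairing: shuffle the population and match consecutive agents
    perm = rng.permutation(n)
    a, b = perm[0::2], perm[1::2]
    r_a = payoff[actions[a], actions[b]]  # row = own action, col = opponent's
    r_b = payoff[actions[b], actions[a]]

    # Stateless Q-learning update for the action each agent played
    Q[a, actions[a]] += alpha * (r_a - Q[a, actions[a]])
    Q[b, actions[b]] += alpha * (r_b - Q[b, actions[b]])

# The mean field model predicts how this population distribution of
# Q-values evolves; here we just report its first two moments.
print("mean Q:", Q.mean(axis=0), "var Q:", Q.var(axis=0))
```

In such a simulation, each agent experiences the rest of the population only through the opponents it is randomly matched with, which is the averaged effect that the mean field approximation replaces individual interactions with.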