Fictitious Play in Product Markov Games With Kullback-Leibler Control Cost
Khaled Nakhleh · Sarper Aydin · Ceyhun Eksin · Sabit Ekin
Abstract
We present and analyze fictitious play for a new class of product Markov games with a Kullback-Leibler (KL) control cost. In a product Markov game, the state transition is the product of $n$ Markov transition functions: each agent controls its own local state transition dynamics given the common state and incurs a KL control cost for its effort. In fictitious play, each agent best-responds to its local beliefs about the other agents' policies by minimizing its discounted sum of instantaneous costs, which depend on a KL control cost and a state cost. Agents update their beliefs about the other agents' policies upon observing the realized states. We show that fictitious play converges asymptotically to a Nash equilibrium of the product Markov game. Simulation results on a multi-agent shortest path problem with collision avoidance on a grid confirm the convergence result and illustrate its speed of convergence.
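The two structural ingredients named in the abstract, a joint transition that factors into local transitions and a KL control cost, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the KL cost is measured against a reference ("passive") local distribution, as is common in KL-control formulations; the abstract does not specify this. The names `kl_divergence`, `joint_transition`, `passive`, and `controlled`, and all numbers, are hypothetical.

```python
import math
from itertools import product


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def joint_transition(local_dists):
    """Joint next-state distribution as the product of local distributions.

    local_dists[i][s] is agent i's probability of moving to local state s,
    already conditioned on the current common state.
    """
    joint = {}
    for combo in product(*(range(len(d)) for d in local_dists)):
        prob = 1.0
        for agent, s in enumerate(combo):
            prob *= local_dists[agent][s]
        joint[combo] = prob
    return joint


# Hypothetical two-agent example with two local states each.
passive = [0.5, 0.5]                    # assumed reference dynamics
controlled = [[0.9, 0.1], [0.2, 0.8]]   # each agent's chosen local transition

# Per-agent control effort, measured as divergence from the reference.
kl_costs = [kl_divergence(c, passive) for c in controlled]

# Probability of every joint next state (a dict keyed by local-state tuples).
joint = joint_transition(controlled)
```

In this sketch an agent that leaves its local dynamics at the reference distribution pays zero KL cost, and any deviation is penalized in proportion to how far it moves the distribution.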