The Lagrangian Method for Solving Locally Constrained Markov Games
Abstract
We introduce a dual gradient descent method to solve Markov games subject to local but coupled constraints. These constrained Markov games (CMGs) model multiagent systems operating in dynamic environments where each agent seeks to maximize its individual utility while respecting specific local constraints. Such constraints, which capture real-world limitations such as resource budgets or safety requirements, are local in the sense that they pertain to individual agents, but they are coupled through dependencies on other agents' actions and the shared environment state. Our approach builds on the concept of the unconstrained Lagrangian game, which agents repeatedly solve based on the current values of their own local Lagrange multipliers. Agents simulate trajectories to estimate cumulative rewards and constraint violations, then adjust the multipliers via a dual gradient step grounded in their local observations. This primal-dual process produces a sequence of policy profiles that converges to a nonstationary and feasible Nash equilibrium of the original constrained game. This work extends previous frameworks for solving Markov games with global constraints, providing a principled approach to safe multiagent learning under localized, interdependent feasibility conditions.
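The primal-dual scheme sketched in the abstract can be illustrated with a minimal toy loop. This is a sketch under stated assumptions, not the paper's algorithm: the primal step (solving the Lagrangian game and simulating trajectories) is abstracted into a caller-supplied `estimate_violation` function, and the names `dual_update` and `primal_dual_loop` are hypothetical, introduced here for illustration only.

```python
import numpy as np

def dual_update(lmbdas, violations, step_size):
    # Projected dual gradient step: each agent's local multiplier grows
    # when its constraint is violated (violation > 0) and shrinks when
    # satisfied, projected back onto the nonnegative orthant.
    return np.maximum(0.0, lmbdas + step_size * violations)

def primal_dual_loop(estimate_violation, n_agents, n_rounds=50, step_size=0.1):
    # Each agent maintains its own local Lagrange multiplier.  In the
    # full method, each round's primal step would (approximately) solve
    # the unconstrained Lagrangian game at the current multipliers and
    # simulate trajectories to estimate constraint violations; here that
    # whole step is stubbed out by estimate_violation.
    lmbdas = np.zeros(n_agents)
    for _ in range(n_rounds):
        violations = estimate_violation(lmbdas)
        lmbdas = dual_update(lmbdas, violations, step_size)
    return lmbdas
```

For example, with a stylized violation signal `1 - lambda` per agent, the multipliers drift toward the fixed point where the estimated violation vanishes, mirroring how the dual iterates settle once the constraints are (approximately) satisfied.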