A Constrained Multi-Agent Reinforcement Learning Approach to Autonomous Traffic Signal Control
Abstract
Traffic congestion persists because traditional fixed-time signal systems cannot adapt to dynamic traffic conditions. While Adaptive Traffic Signal Control (ATSC) methods adjust signals in real time, they often fail to incorporate real-world constraints such as fairness, safety, and operational feasibility. We frame ATSC as a constrained multi-agent reinforcement learning (MARL) problem and propose MAPPO-LCE, which integrates a Lagrange Cost Estimator to stabilize constraint optimization alongside three novel real-world constraints: GreenTime, GreenSkip, and PhaseSkip. Experiments across three real-world datasets show that MAPPO-LCE outperforms both baseline MARL and constrained MARL methods in sample complexity and constraint satisfaction. Our results demonstrate that constrained MARL enables scalable and practical ATSC deployment in real traffic networks.
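
To make the constrained formulation concrete, the following minimal sketch shows a generic Lagrangian relaxation step of the kind commonly used in constrained reinforcement learning. It is not the MAPPO-LCE algorithm or its Lagrange Cost Estimator; the function names, the learning rate, the cost limit, and the sample cost values are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: generic projected dual ascent on a Lagrange
# multiplier. All names and numbers here are assumptions, not the paper's.

def update_multiplier(lam, estimated_cost, cost_limit, lr=0.05):
    """One dual-ascent step: raise lam when the estimated constraint cost
    exceeds the limit, lower it otherwise, and clip at zero."""
    return max(0.0, lam + lr * (estimated_cost - cost_limit))


def penalized_reward(reward, cost, lam):
    """Reward signal with the constraint cost folded in as a penalty."""
    return reward - lam * cost


lam = 0.0
# Hypothetical per-iteration cost estimates (e.g., constraint violations).
for estimated_cost in [3.0, 2.5, 1.0, 0.5]:
    lam = update_multiplier(lam, estimated_cost, cost_limit=1.0)
```

In this sketch the multiplier grows while the estimated cost stays above the limit and shrinks once it falls below, so the penalty on the policy's reward tracks how badly the constraint is being violated.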