Scaling Multi-Agent Coordination Through Explicit Intention Communication
Abstract
Scaling environmental complexity is a critical frontier for advancing multi-agent intelligence. As environments grow in size, dimensionality, and partial observability, agents require increasingly sophisticated coordination mechanisms to maintain performance. This paper investigates the role of communication in such scaled environments by comparing two distinct strategies in a cooperative multi-agent reinforcement learning (MARL) task: an emergent protocol and an engineered, intention-based protocol. For the emergent approach, we introduce Learned Direct Communication (LDC), in which agents with independent (non-shared) network parameters learn to generate and interpret messages end-to-end. For the engineered approach, we propose Intention Communication, a structured architecture featuring an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) that allows agents to explicitly formulate and share forward-looking plans. We evaluate both strategies in a partially observable grid world while progressively scaling the environment's size. Our findings reveal that while emergent communication is viable in simpler settings, its performance degrades sharply as the environment scales. In contrast, the engineered Intention Communication approach is markedly more robust and sample efficient, maintaining near-optimal success rates even in significantly larger and more challenging environments. This work underscores that for agents to succeed in increasingly complex, scaled-up interactive settings, structured and explicit coordination mechanisms may be fundamentally more scalable than purely emergent protocols.
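
To make the engineered pipeline concrete, the following is a minimal PyTorch sketch of the Intention Communication flow described above: an ITGM that rolls an internal model forward to imagine future states, and an MGN that compresses the imagined trajectory into a fixed-size message. The module names follow the abstract, but the GRU-based rollout, layer sizes, horizon, and message dimensionality are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of Intention Communication (ITGM + MGN).
# Architecture details (GRU dynamics, dimensions) are assumptions for illustration.

import torch
import torch.nn as nn


class ImaginedTrajectoryGenerationModule(nn.Module):
    """Rolls an internal model forward to produce a short imagined trajectory."""

    def __init__(self, obs_dim: int, hidden_dim: int = 64, horizon: int = 4):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.Linear(obs_dim, hidden_dim)        # embed (predicted) observation
        self.dynamics = nn.GRUCell(hidden_dim, hidden_dim)   # one-step imagined transition
        self.decoder = nn.Linear(hidden_dim, obs_dim)        # map latent back to state space

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = torch.tanh(self.encoder(obs))
        x = torch.zeros_like(h)
        states = []
        for _ in range(self.horizon):            # imagine `horizon` steps ahead
            h = self.dynamics(x, h)
            pred = self.decoder(h)
            states.append(pred)
            x = torch.tanh(self.encoder(pred))   # feed the prediction back in
        return torch.stack(states, dim=1)        # (batch, horizon, obs_dim)


class MessageGenerationNetwork(nn.Module):
    """Compresses an imagined trajectory into a fixed-size intention message."""

    def __init__(self, obs_dim: int, horizon: int = 4, msg_dim: int = 16):
        super().__init__()
        self.fc = nn.Linear(obs_dim * horizon, msg_dim)

    def forward(self, trajectory: torch.Tensor) -> torch.Tensor:
        flat = trajectory.flatten(start_dim=1)   # (batch, horizon * obs_dim)
        return torch.tanh(self.fc(flat))         # bounded message vector


# Usage: each agent imagines its future and broadcasts the summary to teammates.
obs = torch.randn(2, 10)                         # two agents, 10-dim observations
itgm = ImaginedTrajectoryGenerationModule(obs_dim=10)
mgn = MessageGenerationNetwork(obs_dim=10)
message = mgn(itgm(obs))                         # (2, 16) intention messages
```

The key design point the sketch captures is that the message is derived from a forward-looking plan rather than from the raw observation, which is what distinguishes Intention Communication from the emergent LDC baseline, where the mapping from observation to message is learned end-to-end without any explicit trajectory structure.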