Scaling Environments for LLM Agents: Fundamentals, Approaches, and Future Directions
Abstract
There is a growing consensus that LLM agents should evolve by learning from interactive experience rather than static data, thereby acquiring abilities such as higher-order reasoning, long-term planning, and adaptation. This shift reframes the environment from a passive container of tasks into an active data producer that generates tasks and provides feedback on the agent's solutions. In this survey, we systematize the paradigm of scaling environments for LLM agents. We introduce a unified taxonomy organized along two primary dimensions: task generation and feedback provision. Beyond this taxonomy, we review evaluation benchmarks, implementation frameworks, and applications, and identify open challenges. By consolidating existing progress and highlighting future directions, our survey establishes environment scaling as a foundation for sustained agent evolution.