Inference-Time Scaling of Diffusion Models through Classical Search
Abstract
Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle inference-time control in diffusion models, i.e., adapting generated outputs to meet diverse test-time objectives, using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space: it performs compute-efficient global exploration using breadth-first and depth-first tree search, and employs theoretically grounded, scalable local search via annealed Langevin MCMC. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency over baseline methods. These results demonstrate that classical search offers a principled and practical foundation for inference-time scaling in diffusion models, and that our method, which jointly scales local and global search, establishes a new compute-performance Pareto frontier.
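To make the division of labor between local and global search concrete, here is a minimal sketch in Python. It is not the paper's implementation: `score_fn`, `reward_fn`, and the noise schedule `sigmas` are hypothetical stand-ins, the toy Gaussian demo is invented for illustration, and a simple best-first expansion stands in for the breadth-first and depth-first tree search named above, while `langevin_local_search` illustrates the annealed Langevin MCMC local step.

```python
# Illustrative sketch only: annealed Langevin MCMC for local refinement plus a
# best-first tree expansion for global exploration. All names (score_fn,
# reward_fn, sigmas) are hypothetical stand-ins, not the paper's actual code.
import heapq
import numpy as np

def langevin_local_search(x, score_fn, sigmas, n_steps=20, eta=0.1):
    """Annealed Langevin MCMC: refine x by following the noise-conditional
    score at a decreasing sequence of noise levels."""
    for sigma in sigmas:
        step = eta * sigma ** 2  # step size shrinks with the noise level
        for _ in range(n_steps):
            noise = np.sqrt(2 * step) * np.random.randn(*x.shape)
            x = x + step * score_fn(x, sigma) + noise
    return x

def tree_global_search(init_samples, score_fn, reward_fn, sigmas,
                       branch=4, rounds=3):
    """Global exploration: repeatedly expand the highest-reward candidate by
    branching (perturb, then locally refine) and keep the best sample found."""
    # Max-heap via negated rewards; the integer counter breaks ties so the
    # heap never compares numpy arrays directly.
    frontier = [(-reward_fn(x), i, x) for i, x in enumerate(init_samples)]
    heapq.heapify(frontier)
    uid = len(init_samples)
    best_r, best_x = -frontier[0][0], frontier[0][2]
    for _ in range(rounds):
        _, _, x = heapq.heappop(frontier)  # most promising node so far
        for _ in range(branch):
            child = x + sigmas[0] * np.random.randn(*x.shape)  # global jump
            child = langevin_local_search(child, score_fn, sigmas)
            r = reward_fn(child)
            if r > best_r:
                best_r, best_x = r, child
            heapq.heappush(frontier, (-r, uid, child))
            uid += 1
    return best_x, best_r

if __name__ == "__main__":
    # Toy demo: score of a Gaussian centered at `target`, smoothed at level
    # sigma; the reward prefers samples close to the target.
    target = np.array([2.0, -1.0])
    score_fn = lambda x, sigma: (target - x) / (sigma ** 2 + 1.0)
    reward_fn = lambda x: -float(np.linalg.norm(x - target))
    init = [np.random.randn(2) for _ in range(4)]
    x, r = tree_global_search(init, score_fn, reward_fn, sigmas=[1.0, 0.5, 0.1])
    print("best sample:", x, "reward:", r)
```

The sketch separates the two search regimes cleanly: the tree decides *where* in the generative space to spend compute, and Langevin MCMC decides *how* to polish each candidate, which is the joint local/global scaling the abstract refers to.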