Evolve to Inspire: Novelty Search for Diverse Image Generation
Abstract
Text-to-image diffusion models are capable of producing diverse outputs, yet discovering prompts that consistently elicit diversity remains challenging. Re-using or lightly editing a prompt often produces near-duplicate generations, limiting its utility for exploration and ideation. We present WANDER, a novelty-search framework that evolves prompts to produce diverse image sets from a single input. WANDER uses a Large Language Model (LLM) to mutate prompts, guided by semantic “emitters” such as altering style, composition, or atmosphere. Image novelty is quantified using CLIP embeddings, ensuring that each generation expands the diversity of the pool while remaining semantically coherent. Experiments with FLUX-DEV for generation and GPT-4o-mini for mutation show that WANDER produces significantly more diverse image sets than existing prompt optimization baselines, while using fewer tokens. Ablations highlight that emitter-guided control is essential for achieving diversity. By framing diversity as a controllable property, WANDER offers a practical, scalable tool for creative exploration with generative models.
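To make the loop described above concrete, the sketch below shows one plausible reading of emitter-guided novelty search: an LLM mutation step, CLIP-based novelty scoring, and acceptance into the pool only when diversity increases. It is a minimal illustration under stated assumptions, not WANDER's actual implementation; the emitter list, the novelty threshold, and the `mutate_prompt` and `generate_image` interfaces are hypothetical placeholders, and the CLIP model is loaded via the open-source `transformers` library purely for illustration.

```python
# Hedged sketch of an emitter-guided novelty-search loop (not the paper's code).
import numpy as np
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Assumed emitter set; the paper names style, composition, and atmosphere as examples.
EMITTERS = ["alter the style", "alter the composition", "alter the atmosphere"]

def clip_embed(image):
    """Return a unit-normalised CLIP image embedding."""
    inputs = proc(images=image, return_tensors="pt")
    with torch.no_grad():
        feat = clip.get_image_features(**inputs)
    feat = feat / feat.norm(dim=-1, keepdim=True)
    return feat.squeeze(0).numpy()

def novelty(candidate_emb, pool_embs, k=3):
    """Mean cosine distance to the k nearest embeddings already in the pool."""
    if not pool_embs:
        return 1.0
    dists = sorted(1.0 - float(np.dot(candidate_emb, e)) for e in pool_embs)
    return float(np.mean(dists[:k]))

def evolve(seed_prompt, mutate_prompt, generate_image, iterations=20, threshold=0.15):
    """Evolve prompts from a single seed, keeping only sufficiently novel images.

    `mutate_prompt(prompt, emitter)` (an LLM call) and `generate_image(prompt)`
    (a diffusion-model call) are assumed interfaces supplied by the caller;
    `threshold` is an illustrative acceptance cutoff.
    """
    pool, embeddings = [], []
    prompt = seed_prompt
    for i in range(iterations):
        emitter = EMITTERS[i % len(EMITTERS)]      # cycle through semantic emitters
        prompt = mutate_prompt(prompt, emitter)    # LLM rewrites the prompt
        image = generate_image(prompt)             # text-to-image generation
        emb = clip_embed(image)
        if novelty(emb, embeddings) >= threshold:  # accept only if it expands diversity
            pool.append((prompt, image))
            embeddings.append(emb)
    return pool
```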