Keynote #9 Genie 3: A new frontier for world models
Abstract
Genie 3 is a general-purpose world model that can generate an unprecedented diversity of interactive environments from a single text prompt. This marks a significant advance from static video generation to fully interactive simulations of worlds.
Our model is the first foundation world model that allows real-time interaction at 720p resolution and a consistent 24 frames per second. Genie 3 maintains consistency for minutes of continuous interaction, showing marked improvements in realism and coherence over previous-generation models. Furthermore, Genie 3 introduces “promptable world events”, which let users model counterfactuals and alter the state of the world on the fly with text prompts. Genie 3 also demonstrates a rich understanding of the world: it can model complex physical phenomena such as water and lighting, simulate natural ecosystems, and generate imaginative fictional and animated worlds.
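To give a rough sense of what the real-time figures above imply, the following back-of-the-envelope sketch converts them into a per-frame time budget and a frame count. It assumes that "720p" means the common 1280x720 frame size, which the abstract does not state, and the function names are illustrative only. It says nothing about how Genie 3 is implemented.

```python
# Back-of-the-envelope arithmetic for "720p at 24 frames per second".
# Assumption: 720p is taken to mean 1280x720 pixels (not stated in the abstract).
WIDTH, HEIGHT = 1280, 720
FPS = 24


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels that must be produced each second."""
    return width * height * fps


def frames_in(minutes: float, fps: int) -> int:
    """Number of frames generated over a given duration of interaction."""
    return int(minutes * 60 * fps)


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.1f} ms")
    print(f"Pixels per second: {pixels_per_second(WIDTH, HEIGHT, FPS):,}")
    print(f"Frames in one minute: {frames_in(1, FPS):,}")
```

Under these assumptions, each frame, including the response to the user's latest input, has to be ready in roughly 42 milliseconds, and every minute of interaction corresponds to 1,440 generated frames that must remain mutually consistent.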
We believe that world models are a key stepping stone along the path to AGI. By making it possible to train AI agents in an unlimited curriculum of rich simulation environments, Genie 3 opens a new frontier for research in embodied AI and general-purpose agent development.