Static and Dynamic Diffusion Emulators: From Sampling Gray Swan Extreme Events to Suffering from Model Collapse
Abstract
Characterizing rare extreme events is a major challenge in climate and other sciences, and AI emulators may offer a solution: the idea is that an emulator trained on a small dataset can generate large synthetic datasets containing extreme events stronger than any present in the training set. In principle, AI cannot generalize beyond the training set (i.e., extrapolate); however, because the governing equations are fixed, it has been speculated that generative models, especially those that learn the fast dynamics, can produce stronger events from the same dynamical system. Here, we investigate the most promising approaches using 2D geophysical turbulence test cases. First, we train an unconditional static emulator that samples plausible flow states from noise; second, we train an autoregressive dynamic emulator; and third, we iteratively augment the training dataset with the emulator’s own output. We show that both static and dynamic emulators can produce extreme events stronger than those in the training set, effectively creating “gray swans”. However, neither accurately reproduces the frequency (return period) of these rare events. We show that both emulators are constrained by the small training set, and that trying to increase the data size by iteratively training emulators on their own output leads to model collapse: a degenerative feedback loop that progressively worsens the quality of the generated samples and of the extreme-event estimates. Overall, these findings indicate that purely data-driven diffusion models are not capable of learning the underlying data-generating process from a small training set.
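The degenerative feedback loop described above can be illustrated with a toy example that is much simpler than the diffusion emulators studied here: a minimal sketch, assuming only a Gaussian "emulator" repeatedly refit to its own finite-sample output. All names and parameters below are illustrative and are not part of the paper's setup; the point is only that, with finite samples, the fitted tails tend to shrink across generations, degrading extreme-event estimates.

```python
# Toy illustration of model collapse (not the paper's emulator):
# fit a Gaussian to data, sample a new dataset from the fit, refit, repeat.
# Over many generations the estimated variance tends to drift toward zero,
# so the estimated extremes (here, the 1-in-10,000 event) shrink.
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50        # small "training set" at each generation (illustrative)
n_generations = 200

# Generation 0: samples from the true data-generating process N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=n_samples)

for gen in range(n_generations + 1):
    # "Train" the emulator: fit mean and standard deviation to the current data.
    mu, sigma = data.mean(), data.std(ddof=1)
    if gen % 20 == 0:
        # Estimated 0.9999 quantile of the fitted model (z-score ~3.719).
        extreme = mu + 3.719 * sigma
        print(f"gen {gen:3d}: sigma = {sigma:.3f}, "
              f"estimated 1-in-10,000 event = {extreme:.3f}")
    # Replace the training data with the emulator's own output.
    data = rng.normal(loc=mu, scale=sigma, size=n_samples)
```

In this sketch the collapse is driven purely by finite-sample estimation error compounding across generations; the abstract's claim is that an analogous feedback degrades diffusion emulators when their synthetic output is used to augment a small training set.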