Poster
Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure
Xiang Li · Yixiang Dai · Qing Qu
East Exhibit Hall A-C #2808
Recently, diffusion models have emerged as a highly effective new class of deep generative models, demonstrating exceptional generation performance. In this work, we study the generalizability (i.e., the ability to generate new samples) of diffusion models by examining the hidden properties of the learned score functions, which are essentially a series of deep denoisers trained on various noise levels. Notably, we observe that the nonlinear diffusion denoisers exhibit strong linearity when the diffusion model is able to generalize. This discovery leads us to approximate their function mappings with linear models, which serve as first-order approximations of the nonlinear diffusion denoisers. Surprisingly, these linear denoisers are approximately the optimal denoisers for a multivariate Gaussian distribution characterized by the empirical mean and covariance of the training dataset. This finding implies that diffusion models have an inductive bias towards capturing and utilizing the Gaussian structure (covariance information) of the training dataset for data generation. Our experimental results show that this inductive bias becomes more pronounced when the model capacity is relatively small compared to the size of the training dataset. However, even when the model is highly overparameterized, this inductive bias emerges during the initial training phases, before the model fully memorizes its training data. Our study provides crucial insights into understanding the notably strong generalizability recently observed in real-world diffusion models.
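To make the "optimal denoiser for a multivariate Gaussian" concrete, the following is a minimal sketch and not the authors' code. It assumes an additive noise model y = x + sigma * eps with eps ~ N(0, I), which the abstract does not specify. Under that assumption, if x ~ N(mu, Sigma), the posterior mean is E[x | y] = mu + Sigma (Sigma + sigma^2 I)^{-1} (y - mu), a linear (affine) map of y. The function name `gaussian_denoiser` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_denoiser(train_data, sigma):
    """Build the posterior-mean denoiser for x ~ N(mu, Sigma), y = x + sigma * eps.

    mu and Sigma are the empirical mean and covariance of train_data
    (rows are samples). The noise model is an assumption of this sketch.
    """
    mu = train_data.mean(axis=0)
    centered = train_data - mu
    cov = centered.T @ centered / train_data.shape[0]
    dim = cov.shape[0]
    # gain = (Sigma + sigma^2 I)^{-1} Sigma, computed with a linear solve.
    # Sigma commutes with (Sigma + sigma^2 I), so this equals Sigma (Sigma + sigma^2 I)^{-1}.
    gain = np.linalg.solve(cov + sigma**2 * np.eye(dim), cov)

    def denoise(y):
        # Row-vector form of mu + gain @ (y - mu).
        return mu + (y - mu) @ gain.T

    return denoise, mu

# Synthetic stand-in for a training set: 500 samples in 4 dimensions
# with unequal variance per coordinate.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4)) * np.array([3.0, 1.0, 0.5, 0.1])

denoise_low, mu = gaussian_denoiser(train, sigma=1e-4)   # almost no noise
denoise_high, _ = gaussian_denoiser(train, sigma=1e3)    # noise dominates

y = train[0] + 0.01 * rng.normal(size=4)
out_low = denoise_low(y)    # stays close to y
out_high = denoise_high(y)  # shrinks towards the empirical mean
```

At low noise the map is close to the identity, while at high noise it collapses towards the empirical mean. In between, it shrinks each principal direction of the covariance by a factor lambda / (lambda + sigma^2), where lambda is that direction's variance, so the data covariance alone determines the denoiser.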