Oral in Workshop: Deep Generative Models and Downstream Applications

VAEs meet Diffusion Models: Efficient and High-Fidelity Generation

Kushagra Pandey · Avideep Mukherjee · Piyush Rai · Abhishek Kumar


Abstract:

Diffusion probabilistic models achieve state-of-the-art results on several competitive image synthesis benchmarks but lack a low-dimensional, interpretable latent space and are slow at generation. Variational autoencoders (VAEs), on the other hand, have access to a low-dimensional latent space but, despite recent advances, exhibit poor sample quality. We present VAEDM, a novel generative framework for refining VAE-generated samples with diffusion models, along with a novel conditional parameterization of the diffusion forward process. We show that the resulting parameterization improves sampling efficiency at inference time over an unconditional diffusion model while also equipping the diffusion model with the low-dimensional, VAE-inferred latent code. Furthermore, the proposed model exhibits out-of-the-box capabilities on downstream tasks such as image super-resolution and denoising.
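Since the abstract only summarizes the two-stage pipeline, the sketch below illustrates one plausible way such a generator could be wired up: sample a low-dimensional latent, decode it with a VAE into a coarse sample, then refine that sample with a reverse diffusion chain conditioned on the VAE output. Everything here is an illustrative assumption, not the paper's actual architecture: `ToyDecoder`, `ToyDenoiser`, the concatenation-based conditioning, the linear noise schedule, and all shapes are hypothetical stand-ins; only the standard DDPM reverse-step arithmetic is taken from the general diffusion literature.

```python
# Minimal sketch of a "VAE + diffusion refiner" sampler (all names/shapes
# are illustrative assumptions, not the paper's exact method).
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # linear schedule (assumption)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ToyDecoder(nn.Module):
    """Stand-in for the VAE decoder: low-dim latent z -> coarse image."""
    def __init__(self, z_dim=32, img_dim=3 * 32 * 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, img_dim), nn.Tanh())
    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)

class ToyDenoiser(nn.Module):
    """Stand-in for a conditional noise predictor eps(x_t, t, x_vae).
    Conditioning via channel-wise concatenation is an assumption; the
    timestep t is ignored here purely to keep the toy model small."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 3, kernel_size=3, padding=1)
    def forward(self, x_t, t, x_vae):
        return self.net(torch.cat([x_t, x_vae], dim=1))

@torch.no_grad()
def generate(decoder, denoiser, z_dim=32, n=4):
    # Stage 1: sample a low-dimensional latent and decode a coarse sample.
    z = torch.randn(n, z_dim)
    x_vae = decoder(z)
    # Stage 2: refine via the reverse diffusion chain, conditioned on x_vae.
    x = torch.randn_like(x_vae)
    for t in reversed(range(T)):
        eps = denoiser(x, t, x_vae)
        a_t, ab_t = alphas[t], alpha_bars[t]
        # Standard DDPM posterior mean for the predicted-noise parameterization.
        mean = (x - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x, z  # refined sample plus the interpretable VAE latent

samples, latents = generate(ToyDecoder(), ToyDenoiser())
print(samples.shape, latents.shape)
```

Note that because the latent z is retained alongside the refined sample, such a pipeline keeps a low-dimensional handle on each generation, which is the property the abstract highlights over purely unconditional diffusion models.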