
Improving Variational Autoencoders with Inverse Autoregressive Flow
Diederik Kingma · Tim Salimans · Rafal Jozefowicz · Peter Chen · Xi Chen · Ilya Sutskever · Max Welling

Mon Dec 05 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #83

We propose a simple and scalable method for improving the flexibility of variational inference through a transformation with autoregressive neural networks. Autoregressive neural networks, such as RNNs or the PixelCNN, are very powerful models and potentially interesting for use as variational posterior approximations. However, ancestral sampling in such networks is a long sequential operation, and therefore typically very slow on modern parallel hardware, such as GPUs. We show that by inverting autoregressive neural networks we can obtain equally powerful posterior models from which we can sample efficiently on modern hardware. We show that such data transformations, inverse autoregressive flows (IAF), can be used to transform a simple distribution over the latent variables into a much more flexible distribution, while still allowing us to compute the resulting variables' probability density function. The method is simple to implement, can be made arbitrarily flexible and, in contrast with previous work, applies well to models with high-dimensional latent spaces, such as convolutional generative models. The method is applied to a novel deep architecture of variational auto-encoders. In experiments with natural images, we demonstrate that inverse autoregressive flow leads to significant performance gains.
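The core idea can be illustrated with a minimal NumPy sketch (not the authors' implementation): one IAF step computes a shift and log-scale as autoregressive functions of the current latent sample, then applies an elementwise affine update, so sampling is fully parallel across dimensions and the Jacobian log-determinant is just the sum of log-scales. The strictly lower-triangular masked linear maps below stand in for the MADE-style autoregressive networks used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4  # latent dimensionality (illustrative)

# Hypothetical masked weights: strictly lower-triangular, so output i
# depends only on inputs with index < i (the autoregressive property).
mask = np.tril(np.ones((D, D)), k=-1)
W_mu = rng.normal(size=(D, D)) * mask
W_s = rng.normal(size=(D, D)) * mask

def iaf_step(z, logq):
    """One inverse autoregressive flow step.

    mu and log_sigma are autoregressive in z, but the update itself is
    elementwise, so the whole sample is transformed in parallel.
    """
    mu = z @ W_mu.T
    log_sigma = z @ W_s.T              # autoregressive log-scale
    sigma = np.exp(log_sigma)
    z_new = sigma * z + mu             # elementwise affine transform
    # Change-of-variables: subtract the Jacobian log-determinant,
    # which for this triangular map is sum_i log sigma_i.
    logq_new = logq - log_sigma.sum(-1)
    return z_new, logq_new

# Start from a factorized standard-Gaussian sample and its log-density.
eps = rng.normal(size=D)
logq0 = -0.5 * (eps**2 + np.log(2 * np.pi)).sum()
z, logq = iaf_step(eps, logq0)
```

Stacking several such steps (with permutations or re-ordering between them) yields the arbitrarily flexible posterior described in the abstract, while the tracked log-density remains exact.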

Author Information

Diederik Kingma (Google)
Tim Salimans (Algoritmica)
Rafal Jozefowicz (OpenAI)
Peter Chen (UC Berkeley and OpenAI)
Xi Chen (UC Berkeley and OpenAI)

Xi Chen is an associate professor with tenure at the Stern School of Business at New York University, and an affiliated professor of Computer Science and the Center for Data Science. Before that, he was a postdoc in the group of Prof. Michael Jordan at UC Berkeley. He obtained his Ph.D. from the Machine Learning Department at Carnegie Mellon University (CMU). He studies high-dimensional statistical learning, online learning, large-scale stochastic optimization, and applications to operations. He has published more than 20 journal articles in statistics, machine learning, and operations, and more than 30 papers in top peer-reviewed machine learning conferences. He received the NSF CAREER Award, the ICSA Outstanding Young Researcher Award, and Faculty Research Awards from Google, Adobe, Alibaba, and Bloomberg, and was featured in the Forbes "30 Under 30 in Science" list.

Ilya Sutskever (Google)
Max Welling (Microsoft Research AI4Science / University of Amsterdam)
