We propose a simple and scalable method for improving the flexibility of variational inference through a transformation with autoregressive neural networks. Autoregressive neural networks, such as RNNs or the PixelCNN, are very powerful models and potentially interesting for use as variational posterior approximations. However, ancestral sampling in such networks is a long sequential operation, and therefore typically very slow on modern parallel hardware, such as GPUs. We show that by inverting autoregressive neural networks we can obtain equally powerful posterior models from which we can sample efficiently on modern hardware. We show that such data transformations, inverse autoregressive flows (IAF), can be used to transform a simple distribution over the latent variables into a much more flexible distribution, while still allowing us to compute the probability density function of the resulting variables. The method is simple to implement, can be made arbitrarily flexible and, in contrast with previous work, is well suited to models with high-dimensional latent spaces, such as convolutional generative models. The method is applied to a novel deep architecture of variational auto-encoders. In experiments with natural images, we demonstrate that autoregressive flow leads to significant performance gains.
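To make the idea in the abstract concrete, the following is a minimal NumPy sketch of one inverse autoregressive flow step, not the authors' implementation. It assumes the affine form z' = mu(z) + sigma(z) * z, and it stands in for the autoregressive neural network with a single masked linear layer; the names `iaf_step`, `sample_with_log_density`, `W_mu`, `W_s`, `b_mu` and `b_s` are hypothetical. Because mu and sigma for dimension i depend only on dimensions before i, the Jacobian is lower triangular with sigma on its diagonal, so all dimensions can be transformed in parallel and the log-determinant is just the sum of log sigma.

```python
import numpy as np

def iaf_step(z, W_mu, W_s, b_mu, b_s):
    """One IAF step (illustrative): z_new = mu(z) + sigma(z) * z.

    A strictly lower-triangular mask makes mu_i and sigma_i depend only
    on z_j with j < i. A masked linear map is an assumption used here in
    place of a real autoregressive neural network.
    """
    dim = z.shape[-1]
    mask = np.tril(np.ones((dim, dim)), k=-1)   # zero on and above the diagonal
    mu = z @ (W_mu * mask).T + b_mu
    s = z @ (W_s * mask).T + b_s                # s = log sigma
    z_new = mu + np.exp(s) * z                  # computed for all dimensions at once
    log_det = s.sum(axis=-1)                    # log |det dz_new/dz|
    return z_new, log_det

def sample_with_log_density(rng, params, n_samples, dim):
    """Draw samples from the flow and return their log-density.

    Starts from a standard normal and applies each step in `params`,
    a list of (W_mu, W_s, b_mu, b_s) tuples.
    """
    z = rng.standard_normal((n_samples, dim))
    log_q = -0.5 * np.sum(z ** 2 + np.log(2.0 * np.pi), axis=-1)
    for W_mu, W_s, b_mu, b_s in params:
        z, log_det = iaf_step(z, W_mu, W_s, b_mu, b_s)
        log_q = log_q - log_det                 # change of variables
        z = z[..., ::-1]                        # reverse the ordering between steps (|det| = 1)
    return z, log_q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 4
    params = [
        (rng.normal(size=(dim, dim)), 0.1 * rng.normal(size=(dim, dim)),
         rng.normal(size=dim), 0.1 * rng.normal(size=dim))
        for _ in range(2)
    ]
    z, log_q = sample_with_log_density(rng, params, n_samples=5, dim=dim)
    print(z.shape, log_q.shape)
```

Sampling costs one parallel pass per step, whereas evaluating the density of an arbitrary point would require inverting the transformation one dimension at a time. That trade-off is acceptable for a variational posterior, where only the density of the samples just drawn is needed.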
Author Information
Diederik Kingma (Google)
Tim Salimans (Algoritmica)
Rafal Jozefowicz (OpenAI)
Peter Chen (UC Berkeley and OpenAI)
Xi Chen (UC Berkeley and OpenAI)
Xi Chen is a tenured associate professor at the Stern School of Business at New York University, and is also an affiliated professor in Computer Science and the Center for Data Science. Before that, he was a postdoc in the group of Prof. Michael Jordan at UC Berkeley. He obtained his Ph.D. from the Machine Learning Department at Carnegie Mellon University (CMU). He studies high-dimensional statistical learning, online learning, large-scale stochastic optimization, and applications to operations. He has published more than 20 journal articles in statistics, machine learning, and operations, and 30 papers in top peer-reviewed machine learning conference proceedings. He received the NSF CAREER Award, the ICSA Outstanding Young Researcher Award, and Faculty Research Awards from Google, Adobe, Alibaba, and Bloomberg, and was featured in the Forbes list of “30 Under 30 in Science”.
Ilya Sutskever (Google)
Max Welling (University of Amsterdam / Qualcomm AI Research)
More from the Same Authors
-
2020 Poster: Natural Graph Networks »
Pim de Haan · Taco Cohen · Max Welling -
2020 Poster: SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks »
Fabian Fuchs · Daniel E Worrall · Volker Fischer · Max Welling -
2020 Poster: SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows »
Didrik Nielsen · Priyank Jaini · Emiel Hoogeboom · Ole Winther · Max Welling -
2020 Oral: SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows »
Didrik Nielsen · Priyank Jaini · Emiel Hoogeboom · Ole Winther · Max Welling -
2020 Poster: ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA »
Ilyes Khemakhem · Ricardo Monti · Diederik Kingma · Aapo Hyvarinen -
2020 Poster: The Convolution Exponential and Generalized Sylvester Flows »
Emiel Hoogeboom · Victor Garcia Satorras · Jakub Tomczak · Max Welling -
2020 Poster: Bayesian Bits: Unifying Quantization and Pruning »
Mart van Baalen · Christos Louizos · Markus Nagel · Rana Ali Amjad · Ying Wang · Tijmen Blankevoort · Max Welling -
2020 Poster: Experimental design for MRI by greedy policy search »
Tim Bakker · Herke van Hoof · Max Welling -
2020 Spotlight: Experimental design for MRI by greedy policy search »
Tim Bakker · Herke van Hoof · Max Welling -
2020 Spotlight: ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA »
Ilyes Khemakhem · Ricardo Monti · Diederik Kingma · Aapo Hyvarinen -
2020 Poster: MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning »
Elise van der Pol · Daniel E Worrall · Herke van Hoof · Frans Oliehoek · Max Welling -
2019 Workshop: Bayesian Deep Learning »
Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Eric Nalisnick · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2019 Poster: Invert to Learn to Invert »
Patrick Putzky · Max Welling -
2019 Poster: Deep Scale-spaces: Equivariance Over Scale »
Daniel Worrall · Max Welling -
2019 Poster: Integer Discrete Flows and Lossless Compression »
Emiel Hoogeboom · Jorn Peters · Rianne van den Berg · Max Welling -
2019 Poster: The Functional Neural Process »
Christos Louizos · Xiahan Shi · Klamer Schutte · Max Welling -
2019 Poster: Combining Generative and Discriminative Models for Hybrid Inference »
Victor Garcia Satorras · Zeynep Akata · Max Welling -
2019 Spotlight: Combining Generative and Discriminative Models for Hybrid Inference »
Victor Garcia Satorras · Max Welling · Zeynep Akata -
2019 Poster: Combinatorial Bayesian Optimization using the Graph Cartesian Product »
Changyong Oh · Jakub Tomczak · Efstratios Gavves · Max Welling -
2018 Workshop: Bayesian Deep Learning »
Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2018 Workshop: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications »
Lixin Fan · Zhouchen Lin · Max Welling · Yurong Chen · Werner Bailer -
2018 Poster: Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models »
Yining Wang · Xi Chen · Yuan Zhou -
2018 Poster: Graphical Generative Adversarial Networks »
Chongxuan LI · Max Welling · Jun Zhu · Bo Zhang -
2018 Poster: The Importance of Sampling in Meta-Reinforcement Learning »
Bradly Stadie · Ge Yang · Rein Houthooft · Peter Chen · Yan Duan · Yuhuai Wu · Pieter Abbeel · Ilya Sutskever -
2018 Poster: Glow: Generative Flow with Invertible 1x1 Convolutions »
Diederik Kingma · Prafulla Dhariwal -
2018 Poster: 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data »
Maurice Weiler · Wouter Boomsma · Mario Geiger · Max Welling · Taco Cohen -
2017 Workshop: Bayesian Deep Learning »
Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Diederik Kingma · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2017 Workshop: Advances in Approximate Bayesian Inference »
Francisco Ruiz · Stephan Mandt · Cheng Zhang · James McInerney · Dustin Tran · David Blei · Max Welling · Tamara Broderick · Michalis Titsias -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris M Mooij · David Sontag · Richard Zemel · Max Welling -
2017 Poster: Bayesian Compression for Deep Learning »
Christos Louizos · Karen Ullrich · Max Welling -
2016 Workshop: Bayesian Deep Learning »
Yarin Gal · Christos Louizos · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2016 Workshop: Deep Reinforcement Learning »
David Silver · Satinder Singh · Pieter Abbeel · Peter Chen -
2016 Poster: On the Recursive Teaching Dimension of VC Classes »
Peter Chen · Xi Chen · Yu Cheng · Bo Tang -
2016 Poster: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks »
Tim Salimans · Diederik Kingma -
2016 Oral: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks »
Tim Salimans · Diederik Kingma -
2016 Poster: An Online Sequence-to-Sequence Model Using Partial Conditioning »
Navdeep Jaitly · Quoc V Le · Oriol Vinyals · Ilya Sutskever · David Sussillo · Samy Bengio -
2016 Poster: InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets »
Xi Chen · Peter Chen · Yan Duan · Rein Houthooft · John Schulman · Ilya Sutskever · Pieter Abbeel -
2016 Poster: VIME: Variational Information Maximizing Exploration »
Rein Houthooft · Xi Chen · Peter Chen · Yan Duan · John Schulman · Filip De Turck · Pieter Abbeel -
2016 Poster: Improved Techniques for Training GANs »
Tim Salimans · Ian Goodfellow · Wojciech Zaremba · Vicki Cheung · Alec Radford · Peter Chen · Xi Chen -
2015 Workshop: Scalable Monte Carlo Methods for Bayesian Analysis of Big Data »
Babak Shahbaba · Yee Whye Teh · Max Welling · Arnaud Doucet · Christophe Andrieu · Sebastian J. Vollmer · Pierre Jacob -
2015 Symposium: Deep Learning Symposium »
Yoshua Bengio · Marc'Aurelio Ranzato · Honglak Lee · Max Welling · Andrew Y Ng -
2015 Poster: Bayesian dark knowledge »
Anoop Korattikara Balan · Vivek Rathod · Kevin Murphy · Max Welling -
2015 Poster: Optimization Monte Carlo: Efficient and Embarrassingly Parallel Likelihood-Free Inference »
Ted Meeds · Max Welling -
2015 Poster: Variational Dropout and the Local Reparameterization Trick »
Diederik Kingma · Tim Salimans · Max Welling -
2015 Poster: Grammar as a Foreign Language »
Oriol Vinyals · Łukasz Kaiser · Terry Koo · Slav Petrov · Ilya Sutskever · Geoffrey Hinton -
2014 Workshop: High-energy particle physics, machine learning, and the HiggsML data challenge (HEPML) »
Glen Cowan · Balázs Kégl · Kyle Cranmer · Gábor Melis · Tim Salimans · Vladimir Vava Gligorov · Daniel Whiteson · Lester Mackey · Wojciech Kotlowski · Roberto Díaz Morales · Pierre Baldi · Cecile Germain · David Rousseau · Isabelle Guyon · Tianqi Chen -
2014 Workshop: ABC in Montreal »
Max Welling · Neil D Lawrence · Richard D Wilkinson · Ted Meeds · Christian X Robert -
2014 Poster: Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing »
Yuchen Zhang · Xi Chen · Denny Zhou · Michael Jordan -
2014 Spotlight: Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing »
Yuchen Zhang · Xi Chen · Denny Zhou · Michael Jordan -
2014 Poster: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2014 Poster: Semi-supervised Learning with Deep Generative Models »
Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling -
2014 Demonstration: Machine Learning in the Browser »
Ted Meeds · Remco Hendriks · Said Al Faraby · Magiel Bruntink · Max Welling -
2014 Spotlight: Semi-supervised Learning with Deep Generative Models »
Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling -
2014 Oral: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2013 Workshop: Crowdsourcing: Theory, Algorithms and Applications »
Jennifer Wortman Vaughan · Greg Stoddard · Chien-Ju Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill -
2013 Workshop: Probabilistic Models for Big Data »
Neil D Lawrence · Joaquin Quiñonero Candela · Tianshi Gao · James Hensman · Zoubin Ghahramani · Max Welling · David Blei · Ralf Herbrich -
2013 Poster: Variance Reduction for Stochastic Gradient Optimization »
Chong Wang · Xi Chen · Alexander Smola · Eric Xing -
2013 Poster: Distributed Representations of Words and Phrases and their Compositionality »
Tomas Mikolov · Ilya Sutskever · Kai Chen · Greg Corrado · Jeff Dean -
2012 Poster: Optimal Regularized Dual Averaging Methods for Stochastic Optimization »
Xi Chen · Qihang Lin · Javier Pena -
2012 Poster: Clustering by Nonnegative Matrix Factorization Using Graph Random Walk »
Zhirong Yang · Tele Hao · Onur Dikmen · Xi Chen · Erkki Oja -
2012 Poster: The Time-Marginalized Coalescent Prior for Hierarchical Clustering »
Levi Boyles · Max Welling -
2011 Poster: Statistical Tests for Optimization Efficiency »
Levi Boyles · Anoop Korattikara · Deva Ramanan · Max Welling -
2010 Spotlight: Graph-Valued Regression »
Han Liu · Xi Chen · John Lafferty · Larry Wasserman -
2010 Poster: Multivariate Dyadic Regression Trees for Sparse Learning Problems »
Han Liu · Xi Chen -
2010 Poster: On Herding and the Perceptron Cycling Theorem »
Andrew E Gelfand · Yutian Chen · Laurens van der Maaten · Max Welling -
2010 Poster: Graph-Valued Regression »
Han Liu · Xi Chen · John Lafferty · Larry Wasserman -
2010 Poster: Regularized estimation of image statistics by Score Matching »
Diederik Kingma · Yann LeCun -
2009 Poster: Nonparametric Greedy Algorithms for the Sparse Learning Problem »
Han Liu · Xi Chen -
2008 Session: Oral session 10: Nonparametric Processes, Scene Processing and Image Statistics »
Max Welling -
2008 Poster: Asynchronous Distributed Learning of Topic Models »
Arthur Asuncion · Padhraic Smyth · Max Welling -
2007 Spotlight: Collapsed Variational Inference for HDP »
Yee Whye Teh · Kenichi Kurihara · Max Welling -
2007 Spotlight: Distributed Inference for Latent Dirichlet Allocation »
David Newman · Arthur Asuncion · Padhraic Smyth · Max Welling -
2007 Poster: Infinite State Bayes-Nets for Structured Domains »
Max Welling · Ian Porteous · Evgeniy Bart -
2007 Poster: Collapsed Variational Inference for HDP »
Yee Whye Teh · Kenichi Kurihara · Max Welling -
2007 Poster: Distributed Inference for Latent Dirichlet Allocation »
David Newman · Arthur Asuncion · Padhraic Smyth · Max Welling -
2007 Spotlight: Infinite State Bayes-Nets for Structured Domains »
Max Welling · Ian Porteous · Evgeniy Bart -
2006 Poster: Structure Learning in Markov Random Fields »
Sridevi Parise · Max Welling -
2006 Poster: Accelerated Variational Dirichlet Process Mixtures »
Kenichi Kurihara · Max Welling · Nikos Vlassis -
2006 Spotlight: Accelerated Variational Dirichlet Process Mixtures »
Kenichi Kurihara · Max Welling · Nikos Vlassis -
2006 Poster: A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation »
Yee Whye Teh · David Newman · Max Welling