When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results. Inspired by this, we consider the extension of the ELBO to a family of lower bounds defined by a particle filter's estimator of the marginal likelihood, the filtering variational objectives (FIVOs). FIVOs take the same arguments as the ELBO, but can exploit a model's sequential structure to form tighter bounds. We present results that relate the tightness of FIVO's bound to the variance of the particle filter's estimator by considering the generic case of bounds defined as log-transformed likelihood estimators. Experimentally, we show that training with FIVO results in substantial improvements over training the same model architecture with the ELBO on sequential data.
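The link between log-transformed likelihood estimators and lower bounds can be sketched with Jensen's inequality. The notation below is an assumption introduced for illustration, not taken from the abstract: \hat{p}_N(x) denotes any positive, unbiased estimator of the marginal likelihood p(x) built from N particles, and \mathcal{L}_N is the resulting objective. The final approximation is a heuristic second-order Taylor argument, not a formal statement of the paper's results.

```latex
% Assumption: \hat{p}_N(x) > 0 and \mathbb{E}[\hat{p}_N(x)] = p(x).
\begin{align*}
\mathcal{L}_N
  &= \mathbb{E}\bigl[\log \hat{p}_N(x)\bigr]
  \;\le\; \log \mathbb{E}\bigl[\hat{p}_N(x)\bigr]
  \;=\; \log p(x)
  && \text{(Jensen, since $\log$ is concave)} \\[4pt]
\log p(x) - \mathcal{L}_N
  &= -\,\mathbb{E}\!\left[\log \frac{\hat{p}_N(x)}{p(x)}\right]
  \;\approx\; \frac{1}{2}\,\operatorname{Var}\!\left[\frac{\hat{p}_N(x)}{p(x)}\right]
  && \text{(heuristic, small relative variance)}
\end{align*}
```

Under these assumptions, the ELBO corresponds to a single importance weight p(x, z) / q(z | x) as the estimator, and a FIVO corresponds to the particle filter's estimator. A lower-variance estimator then gives a tighter bound, which is the sense in which tightness is related to variance.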
Author Information
Chris Maddison (Oxford)
John Lawson (Google Brain)
George Tucker (Google Brain)
Nicolas Heess (Google DeepMind)
Mohammad Norouzi (Google Brain)
Andriy Mnih (DeepMind)
Arnaud Doucet (Oxford)
Yee Teh (DeepMind)
More from the Same Authors
-
2021 : Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration »
Oliver Groth · Markus Wulfmeier · Giulia Vezzani · Vibhavari Dasagi · Tim Hertweck · Roland Hafner · Nicolas Heess · Martin Riedmiller -
2021 : Palette: Image-to-Image Diffusion Models »
Chitwan Saharia · William Chan · Huiwen Chang · Chris Lee · Jonathan Ho · Tim Salimans · David Fleet · Mohammad Norouzi -
2021 : Optimal Representations for Covariate Shifts »
Yann Dubois · Yangjun Ruan · Chris Maddison -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 : Offline Policy Selection under Uncertainty »
Mengjiao (Sherry) Yang · Bo Dai · Ofir Nachum · George Tucker · Dale Schuurmans -
2021 : Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies »
Dushyant Rao · Fereshteh Sadeghi · Leonard Hasenclever · Markus Wulfmeier · Martina Zambelli · Giulia Vezzani · Dhruva Tirumala · Yusuf Aytar · Josh Merel · Nicolas Heess · Raia Hadsell -
2021 : Gaussian dropout as an information bottleneck layer »
Melanie Rey · Andriy Mnih -
2021 : Uncertainty Quantification in End-to-End Implicit Neural Representations for Medical Imaging »
Francisca Vasconcelos · Bobby He · Yee Teh -
2021 : Offline Meta-Reinforcement Learning for Industrial Insertion »
Tony Zhao · Jianlan Luo · Oleg Sushkov · Rugile Pevceviciute · Nicolas Heess · Jonathan Scholz · Stefan Schaal · Sergey Levine -
2022 : Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes »
Aviral Kumar · Rishabh Agarwal · Xinyang Geng · George Tucker · Sergey Levine -
2022 : Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios »
Yiren Lu · Justin Fu · George Tucker · Xinlei Pan · Eli Bronstein · Rebecca Roelofs · Benjamin Sapp · Brandyn White · Aleksandra Faust · Shimon Whiteson · Dragomir Anguelov · Sergey Levine -
2022 : Score Modeling for Simulation-based Inference »
Tomas Geffner · George Papamakarios · Andriy Mnih -
2022 : Spectral Diffusion Processes »
Angus Phillips · Thomas Seror · Michael Hutchinson · Valentin De Bortoli · Arnaud Doucet · Emile Mathieu -
2022 Spotlight: Lightning Talks 1A-4 »
Siwei Wang · Jing Liu · Nianqiao Ju · Shiqian Li · Eloïse Berthier · Muhammad Faaiz Taufiq · Arsene Fansi Tchango · Chen Liang · Chulin Xie · Jordan Awan · Jean-Francois Ton · Ziad Kobeissi · Wenguan Wang · Xinwang Liu · Kewen Wu · Rishab Goel · Jiaxu Miao · Suyuan Liu · Julien Martel · Ruobin Gong · Francis Bach · Chi Zhang · Rob Cornish · Sanmi Koyejo · Zhi Wen · Yee Whye Teh · Yi Yang · Jiaqi Jin · Bo Li · Yixin Zhu · Vinayak Rao · Wenxuan Tu · Gaetan Marceau Caron · Arnaud Doucet · Xinzhong Zhu · Joumana Ghosn · En Zhu -
2022 Spotlight: Conformal Off-Policy Prediction in Contextual Bandits »
Muhammad Faaiz Taufiq · Jean-Francois Ton · Rob Cornish · Yee Whye Teh · Arnaud Doucet -
2022 : Invited Talk: Mohammad Norouzi »
Mohammad Norouzi -
2022 : Interactive Industrial Panel »
Jiahao Sun · Ahmed Ibrahim · Marjan Ghazvininejad · Yu Cheng · Boxing Chen · Mohammad Norouzi · Rahul Gupta -
2022 Workshop: 3rd Offline Reinforcement Learning Workshop: Offline RL as a "Launchpad" »
Aviral Kumar · Rishabh Agarwal · Aravind Rajeswaran · Wenxuan Zhou · George Tucker · Doina Precup -
2022 Poster: Oracle Inequalities for Model Selection in Offline Reinforcement Learning »
Jonathan N Lee · George Tucker · Ofir Nachum · Bo Dai · Emma Brunskill -
2022 Poster: Conformal Off-Policy Prediction in Contextual Bandits »
Muhammad Faaiz Taufiq · Jean-Francois Ton · Rob Cornish · Yee Whye Teh · Arnaud Doucet -
2022 Poster: A Continuous Time Framework for Discrete Denoising Models »
Andrew Campbell · Joe Benton · Valentin De Bortoli · Thomas Rainforth · George Deligiannidis · Arnaud Doucet -
2022 Poster: Score-Based Diffusion meets Annealed Importance Sampling »
Arnaud Doucet · Will Grathwohl · Alexander Matthews · Heiko Strathmann -
2022 Poster: A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs »
Fabian Falck · Christopher Williams · Dominic Danks · George Deligiannidis · Christopher Yau · Chris C Holmes · Arnaud Doucet · Matthew Willetts -
2022 Poster: Riemannian Score-Based Generative Modelling »
Valentin De Bortoli · Emile Mathieu · Michael Hutchinson · James Thornton · Yee Whye Teh · Arnaud Doucet -
2022 Poster: Towards Learning Universal Hyperparameter Optimizers with Transformers »
Yutian Chen · Xingyou Song · Chansoo Lee · Zi Wang · Richard Zhang · David Dohan · Kazuya Kawakami · Greg Kochanski · Arnaud Doucet · Marc'Aurelio Ranzato · Sagi Perel · Nando de Freitas -
2022 Poster: Data augmentation for efficient learning from parametric experts »
Alexandre Galashov · Josh Merel · Nicolas Heess -
2021 : Speaker Intro »
Aviral Kumar · George Tucker -
2021 : Invited Speaker Panel »
Sham Kakade · Minmin Chen · Philip Thomas · Angela Schoellig · Barbara Engelhardt · Doina Precup · George Tucker -
2021 Workshop: Offline Reinforcement Learning »
Rishabh Agarwal · Aviral Kumar · George Tucker · Justin Fu · Nan Jiang · Doina Precup -
2021 : DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A »
Aviral Kumar · Rishabh Agarwal · Tengyu Ma · Aaron Courville · George Tucker · Sergey Levine -
2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features? »
Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 Poster: On Contrastive Representations of Stochastic Processes »
Emile Mathieu · Adam Foster · Yee Teh -
2021 Poster: Group Equivariant Subsampling »
Jin Xu · Hyunjik Kim · Thomas Rainforth · Yee Teh -
2021 Poster: Powerpropagation: A sparsity inducing weight reparameterisation »
Jonathan Richard Schwarz · Siddhant Jayakumar · Razvan Pascanu · Peter E Latham · Yee Teh -
2021 Poster: Coupled Gradient Estimators for Discrete Latent Variables »
Zhe Dong · Andriy Mnih · George Tucker -
2021 Poster: Neural Production Systems »
Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio -
2021 Poster: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations »
Tim G. J. Rudner · Cong Lu · Michael A Osborne · Yarin Gal · Yee Teh -
2021 Poster: Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels »
Michael Hutchinson · Alexander Terenin · Viacheslav Borovitskiy · So Takao · Yee Teh · Marc Deisenroth -
2021 Poster: BayesIMP: Uncertainty Quantification for Causal Data Fusion »
Siu Lun Chau · Jean-Francois Ton · Javier González · Yee Teh · Dino Sejdinovic -
2021 Poster: Neural Ensemble Search for Uncertainty Estimation and Dataset Shift »
Sheheryar Zaidi · Arber Zela · Thomas Elsken · Chris C Holmes · Frank Hutter · Yee Teh -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 Workshop: Offline Reinforcement Learning »
Aviral Kumar · Rishabh Agarwal · George Tucker · Lihong Li · Doina Precup -
2020 : Introduction »
Aviral Kumar · George Tucker · Rishabh Agarwal -
2020 Poster: Value-driven Hindsight Modelling »
Arthur Guez · Fabio Viola · Theophane Weber · Lars Buesing · Steven Kapturowski · Doina Precup · David Silver · Nicolas Heess -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: DisARM: An Antithetic Gradient Estimator for Binary Latent Variables »
Zhe Dong · Andriy Mnih · George Tucker -
2020 Spotlight: DisARM: An Antithetic Gradient Estimator for Binary Latent Variables »
Zhe Dong · Andriy Mnih · George Tucker -
2020 Poster: Conservative Q-Learning for Offline Reinforcement Learning »
Aviral Kumar · Aurick Zhou · George Tucker · Sergey Levine -
2020 : Policy Panel »
Roya Pakzad · Dia Kayyali · Marzyeh Ghassemi · Shakir Mohamed · Mohammad Norouzi · Ted Pedersen · Anver Emon · Abubakar Abid · Darren Byler · Samhaa R. El-Beltagy · Nayel Shafei · Mona Diab -
2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces »
Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow -
2020 Affinity Workshop: Muslims in ML »
Marzyeh Ghassemi · Mohammad Norouzi · Shakir Mohamed · Aya Salama · Tasmie Sarker -
2019 Poster: Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction »
Aviral Kumar · Justin Fu · George Tucker · Sergey Levine -
2019 Poster: Energy-Inspired Models: Learning with Sampler-Induced Distributions »
Dieterich Lawson · George Tucker · Bo Dai · Rajesh Ranganath -
2019 Poster: Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse »
James Lucas · George Tucker · Roger Grosse · Mohammad Norouzi -
2019 Poster: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Spotlight: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Poster: Augmented Neural ODEs »
Emilien Dupont · Arnaud Doucet · Yee Whye Teh -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Probabilistic Reasoning for Reinforcement Learning (Nicolas Heess) »
Nicolas Heess -
2018 : Introduction of the workshop »
Razvan Pascanu · Yee Teh · Mark Ring · Marc Pickett -
2018 Workshop: Continual Learning »
Razvan Pascanu · Yee Teh · Marc Pickett · Mark Ring -
2018 Poster: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion »
Jacob Buckman · Danijar Hafner · George Tucker · Eugene Brevdo · Honglak Lee -
2018 Oral: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion »
Jacob Buckman · Danijar Hafner · George Tucker · Eugene Brevdo · Honglak Lee -
2018 Poster: Implicit Reparameterization Gradients »
Mikhail Figurnov · Shakir Mohamed · Andriy Mnih -
2018 Spotlight: Implicit Reparameterization Gradients »
Mikhail Figurnov · Shakir Mohamed · Andriy Mnih -
2018 Poster: Hamiltonian Variational Auto-Encoder »
Anthony Caterini · Arnaud Doucet · Dino Sejdinovic -
2017 Poster: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models »
George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein -
2017 Poster: Bridging the Gap Between Value and Policy Based Reinforcement Learning »
Ofir Nachum · Mohammad Norouzi · Kelvin Xu · Dale Schuurmans -
2017 Poster: Distral: Robust multitask reinforcement learning »
Yee Teh · Victor Bapst · Wojciech Czarnecki · John Quan · James Kirkpatrick · Raia Hadsell · Nicolas Heess · Razvan Pascanu -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Oral: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models »
George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein -
2017 Poster: Variational Memory Addressing in Generative Models »
Jörg Bornschein · Andriy Mnih · Daniel Zoran · Danilo Jimenez Rezende -
2017 Poster: Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling »
Andrei-Cristian Barbos · Francois Caron · Jean-François Giovannelli · Arnaud Doucet -
2017 Poster: Robust Imitation of Diverse Behaviors »
Ziyu Wang · Josh Merel · Scott Reed · Nando de Freitas · Gregory Wayne · Nicolas Heess -
2017 Poster: Learning Hierarchical Information Flow with Recurrent Neural Modules »
Danijar Hafner · Alexander Irpan · James Davidson · Nicolas Heess -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · Koray Kavukcuoglu · Geoffrey E Hinton -
2015 Workshop: Scalable Monte Carlo Methods for Bayesian Analysis of Big Data »
Babak Shahbaba · Yee Whye Teh · Max Welling · Arnaud Doucet · Christophe Andrieu · Sebastian J. Vollmer · Pierre Jacob -
2015 Poster: Gradient Estimation Using Stochastic Computation Graphs »
John Schulman · Nicolas Heess · Theophane Weber · Pieter Abbeel -
2015 Poster: Learning Continuous Control Policies by Stochastic Value Gradients »
Nicolas Heess · Gregory Wayne · David Silver · Timothy Lillicrap · Tom Erez · Yuval Tassa -
2015 Poster: Expectation Particle Belief Propagation »
Thibaut Lienart · Yee Whye Teh · Arnaud Doucet -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2014 Poster: Asynchronous Anytime Sequential Monte Carlo »
Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh -
2014 Oral: Asynchronous Anytime Sequential Monte Carlo »
Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh -
2009 Poster: Bayesian Nonparametric Models on Decomposable Graphs »
Francois Caron · Arnaud Doucet -
2009 Tutorial: Sequential Monte-Carlo Methods »
Arnaud Doucet · Nando de Freitas -
2007 Spotlight: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra -
2007 Poster: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra