Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train. In this paper, we show how to combine the favourable aspects of these two approaches. The base of our model is a new type of variational autoencoder on demonstration trajectories that learns semantic policy embeddings. We show that these embeddings can be learned on a 9 DoF Jaco robot arm in reaching tasks, and that interpolating smoothly between them yields a correspondingly smooth interpolation of reaching behavior. Leveraging these policy representations, we develop a new version of GAIL that (1) is much more robust than the purely supervised controller, especially with few demonstrations, and (2) avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not. We demonstrate our approach on learning diverse gaits from demonstration on a 2D biped and a 62 DoF 3D humanoid in the MuJoCo physics environment.
Author Information
Ziyu Wang (DeepMind)
Josh Merel (DeepMind)
Scott Reed (Google DeepMind)
Nando de Freitas (DeepMind)
Gregory Wayne (Google DeepMind)
Nicolas Heess (Google DeepMind)
More from the Same Authors
-
2021 : Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration »
Oliver Groth · Markus Wulfmeier · Giulia Vezzani · Vibhavari Dasagi · Tim Hertweck · Roland Hafner · Nicolas Heess · Martin Riedmiller -
2021 : Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies »
Dushyant Rao · Fereshteh Sadeghi · Leonard Hasenclever · Markus Wulfmeier · Martina Zambelli · Giulia Vezzani · Dhruva Tirumala · Yusuf Aytar · Josh Merel · Nicolas Heess · Raia Hadsell -
2021 : Offline Meta-Reinforcement Learning for Industrial Insertion »
Tony Zhao · Jianlan Luo · Oleg Sushkov · Rugile Pevceviciute · Nicolas Heess · Jonathan Scholz · Stefan Schaal · Sergey Levine -
2022 : Multi-step Planning for Automated Hyperparameter Optimization with OptFormer »
Lucio M Dery · Abram Friesen · Nando de Freitas · Marc'Aurelio Ranzato · Yutian Chen -
2023 Poster: Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis »
Alexander Meulemans · Simon Schug · Seijin Kobayashi · Nathaniel Daw · Gregory Wayne -
2023 Poster: Coherent Soft Imitation Learning »
Joe Watson · Sandy Huang · Nicolas Heess -
2022 Poster: Intra-agent speech permits zero-shot task acquisition »
Chen Yan · Federico Carnevale · Petko I Georgiev · Adam Santoro · Aurelia Guy · Alistair Muldal · Chia-Chun Hung · Joshua Abramson · Timothy Lillicrap · Gregory Wayne -
2022 Poster: Towards Learning Universal Hyperparameter Optimizers with Transformers »
Yutian Chen · Xingyou Song · Chansoo Lee · Zi Wang · Richard Zhang · David Dohan · Kazuya Kawakami · Greg Kochanski · Arnaud Doucet · Marc'Aurelio Ranzato · Sagi Perel · Nando de Freitas -
2022 Poster: Data augmentation for efficient learning from parametric experts »
Alexandre Galashov · Josh Merel · Nicolas Heess -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 Poster: Neural Production Systems »
Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 : Offline RL »
Nando de Freitas -
2020 Poster: Value-driven Hindsight Modelling »
Arthur Guez · Fabio Viola · Theophane Weber · Lars Buesing · Steven Kapturowski · Doina Precup · David Silver · Nicolas Heess -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Poster: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Spotlight: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: Gaussian Gated Linear Networks »
David Budden · Adam Marblestone · Eren Sezener · Tor Lattimore · Gregory Wayne · Joel Veness -
2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces »
Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow -
2019 Workshop: Science meets Engineering of Deep Learning »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 : Welcoming remarks and introduction »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 Poster: Interval timing in deep reinforcement learning agents »
Ben Deverett · Ryan Faulkner · Meire Fortunato · Gregory Wayne · Joel Leibo -
2019 Poster: Experience Replay for Continual Learning »
David Rolnick · Arun Ahuja · Jonathan Richard Schwarz · Timothy Lillicrap · Gregory Wayne -
2019 Poster: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Poster: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas Pierrot · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2019 Spotlight: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Spotlight: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas Pierrot · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Probabilistic Reasoning for Reinforcement Learning (Nicolas Heess) »
Nicolas Heess -
2018 : TBA 5 »
Nando de Freitas -
2018 : Invited Talk 5: Nando de Freitas »
Nando de Freitas -
2018 Poster: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Spotlight: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Poster: Neural Arithmetic Logic Units »
Andrew Trask · Felix Hill · Scott Reed · Jack Rae · Chris Dyer · Phil Blunsom -
2018 Poster: Learning Attractor Dynamics for Generative Memory »
Yan Wu · Gregory Wayne · Karol Gregor · Timothy Lillicrap -
2017 Poster: Distral: Robust multitask reinforcement learning »
Yee Teh · Victor Bapst · Wojciech Czarnecki · John Quan · James Kirkpatrick · Raia Hadsell · Nicolas Heess · Razvan Pascanu -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Filtering Variational Objectives »
Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh -
2017 Poster: Learning Hierarchical Information Flow with Recurrent Neural Modules »
Danijar Hafner · Alexander Irpan · James Davidson · Nicolas Heess -
2017 Tutorial: Deep Learning: Practice and Trends »
Nando de Freitas · Scott Reed · Oriol Vinyals -
2016 Workshop: Neural Abstract Machines & Program Induction »
Matko Bošnjak · Nando de Freitas · Tejas Kulkarni · Arvind Neelakantan · Scott E Reed · Sebastian Riedel · Tim Rocktäschel -
2016 : Summary/Goodbye »
Tarek R. Besold · Artur Garcez · Antoine Bordes · Gregory Wayne -
2016 : Nando De Freitas »
Nando de Freitas -
2016 : Welcome/Opening »
Tarek R. Besold · Antoine Bordes · Gregory Wayne · Artur Garcez -
2016 : Learning To Optimize »
Nando de Freitas -
2016 Workshop: Cognitive Computation: Integrating Neural and Symbolic Approaches »
Tarek R. Besold · Antoine Bordes · Gregory Wayne · Artur Garcez -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Poster: Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes »
Jack Rae · Jonathan J Hunt · Ivo Danihelka · Tim Harley · Andrew Senior · Gregory Wayne · Alex Graves · Timothy Lillicrap -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · Koray Kavukcuoglu · Geoffrey E Hinton -
2016 Poster: Learning to learn by gradient descent by gradient descent »
Marcin Andrychowicz · Misha Denil · Sergio Gómez · Matthew Hoffman · David Pfau · Tom Schaul · Nando de Freitas -
2015 Workshop: Bayesian Optimization: Scalability and Flexibility »
Bobak Shahriari · Ryan Adams · Nando de Freitas · Amar Shah · Roberto Calandra -
2015 : Discussion Panel with Afternoon Speakers (Day 1) »
Ramanathan Guha · Antoine Bordes · Gregory Wayne -
2015 : How Can We Direct Our Agents? »
Gregory Wayne -
2015 Poster: Gradient Estimation Using Stochastic Computation Graphs »
John Schulman · Nicolas Heess · Theophane Weber · Pieter Abbeel -
2015 Poster: Learning Continuous Control Policies by Stochastic Value Gradients »
Nicolas Heess · Gregory Wayne · David Silver · Timothy Lillicrap · Tom Erez · Yuval Tassa -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu