Curiosity-based reward schemes can present powerful exploration mechanisms which facilitate the discovery of solutions for complex, sparse or long-horizon tasks. However, as the agent learns to reach previously unexplored spaces and the objective adapts to reward new areas, many behaviours emerge only to disappear due to being overwritten by the constantly shifting objective. We argue that merely using curiosity for fast environment exploration or as a bonus reward for a specific task does not harness the full potential of this technique and misses useful skills. Instead, we propose to shift the focus towards retaining the behaviours which emerge during curiosity-based learning. We posit that these self-discovered behaviours serve as valuable skills in an agent's repertoire to solve related tasks. Our experiments demonstrate the continuous shift in behaviour throughout training and the benefits of a simple policy snapshot method to reuse discovered behaviour for transfer tasks.
Author Information
Oliver Groth (University of Oxford)
Markus Wulfmeier (DeepMind)
Giulia Vezzani (Google DeepMind)
Vibhavari Dasagi (Queensland University of Technology)
Tim Hertweck (DeepMind)
Roland Hafner (Google DeepMind)
Nicolas Heess (Google DeepMind)
Martin Riedmiller (DeepMind)
More from the Same Authors
-
2021 : Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies »
Dushyant Rao · Fereshteh Sadeghi · Leonard Hasenclever · Markus Wulfmeier · Martina Zambelli · Giulia Vezzani · Dhruva Tirumala · Yusuf Aytar · Josh Merel · Nicolas Heess · Raia Hadsell -
2021 : Zero-Shot Uncertainty-Aware Deployment of Simulation Trained Policies on Real-World Robots »
Krishan Rana · Vibhavari Dasagi · Michael Milford · Niko Suenderhauf -
2021 : Offline Meta-Reinforcement Learning for Industrial Insertion »
Tony Zhao · Jianlan Luo · Oleg Sushkov · Rugile Pevceviciute · Nicolas Heess · Jonathan Scholz · Stefan Schaal · Sergey Levine -
2022 : Fifteen-minute Competition Overview Video »
Nico Gürtler · Georg Martius · Pavel Kolev · Sebastian Blaes · Manuel Wuethrich · Markus Wulfmeier · Cansu Sancaktar · Martin Riedmiller · Arthur Allshire · Bernhard Schölkopf · Annika Buchholz · Stefan Bauer -
2023 Poster: Coherent Soft Imitation Learning »
Joe Watson · Sandy Huang · Nicolas Heess -
2022 Competition: Real Robot Challenge III - Learning Dexterous Manipulation from Offline Data in the Real World »
Nico Gürtler · Georg Martius · Sebastian Blaes · Pavel Kolev · Cansu Sancaktar · Stefan Bauer · Manuel Wuethrich · Markus Wulfmeier · Martin Riedmiller · Arthur Allshire · Annika Buchholz · Bernhard Schölkopf -
2022 Poster: Data augmentation for efficient learning from parametric experts »
Alexandre Galashov · Josh Merel · Nicolas Heess -
2021 : Panel A: Deployable Learning Algorithms for Embodied Systems »
Shuran Song · Martin Riedmiller · Nick Roy · Aude G Billard · Angela Schoellig · SiQi Zhou -
2021 : Reinforcement Learning in Real-World Control Systems »
Martin Riedmiller -
2021 Workshop: 4th Robot Learning Workshop: Self-Supervised and Lifelong Learning »
Alex Bewley · Masha Itkina · Hamidreza Kasaei · Jens Kober · Nathan Lambert · Julien Perez · Ransalu Senanayake · Vincent Vanhoucke · Markus Wulfmeier · Igor Gilitschenski -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 Poster: Neural Production Systems »
Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio -
2021 Poster: Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies »
Tim Seyde · Igor Gilitschenski · Wilko Schwarting · Bartolomeo Stellato · Martin Riedmiller · Markus Wulfmeier · Daniela Rus -
2020 Workshop: 3rd Robot Learning Workshop »
Masha Itkina · Alex Bewley · Roberto Calandra · Igor Gilitschenski · Julien Perez · Ransalu Senanayake · Markus Wulfmeier · Vincent Vanhoucke -
2020 Poster: Value-driven Hindsight Modelling »
Arthur Guez · Fabio Viola · Theophane Weber · Lars Buesing · Steven Kapturowski · Doina Precup · David Silver · Nicolas Heess -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces »
Sebastien Ehrhardt · Oliver Groth · Aron Monszpart · Martin Engelcke · Ingmar Posner · Niloy Mitra · Andrea Vedaldi -
2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces »
Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow -
2019 Workshop: Robot Learning: Control and Interaction in the Real World »
Roberto Calandra · Markus Wulfmeier · Kate Rakelly · Sanket Kamthe · Danica Kragic · Stefan Schaal -
2019 Poster: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Spotlight: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Probabilistic Reasoning for Reinforcement Learning (Nicolas Heess) »
Nicolas Heess -
2018 Workshop: Infer to Control: Probabilistic Reinforcement Learning and Structured Control »
Leslie Kaelbling · Martin Riedmiller · Marc Toussaint · Igor Mordatch · Roy Fox · Tuomas Haarnoja -
2017 Workshop: Acting and Interacting in the Real World: Challenges in Robot Learning »
Ingmar Posner · Raia Hadsell · Martin Riedmiller · Markus Wulfmeier · Rohan Paul -
2017 Poster: Distral: Robust multitask reinforcement learning »
Yee Teh · Victor Bapst · Wojciech Czarnecki · John Quan · James Kirkpatrick · Raia Hadsell · Nicolas Heess · Razvan Pascanu -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Filtering Variational Objectives »
Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh -
2017 Poster: Robust Imitation of Diverse Behaviors »
Ziyu Wang · Josh Merel · Scott Reed · Nando de Freitas · Gregory Wayne · Nicolas Heess -
2017 Poster: Learning Hierarchical Information Flow with Recurrent Neural Modules »
Danijar Hafner · Alexander Irpan · James Davidson · Nicolas Heess -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · Koray Kavukcuoglu · Geoffrey E Hinton -
2015 Poster: Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images »
Manuel Watter · Jost Springenberg · Joschka Boedecker · Martin Riedmiller -
2015 Poster: Gradient Estimation Using Stochastic Computation Graphs »
John Schulman · Nicolas Heess · Theophane Weber · Pieter Abbeel -
2015 Poster: Learning Continuous Control Policies by Stochastic Value Gradients »
Nicolas Heess · Gregory Wayne · David Silver · Timothy Lillicrap · Tom Erez · Yuval Tassa -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu