Timezone: »
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (DIStill & TRAnsfer Learning). Instead of sharing parameters between the different workers, we propose to share a distilled policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning.
Author Information
Yee Teh (DeepMind)
Victor Bapst (DeepMind)
Wojciech Czarnecki (DeepMind)
John Quan (Google DeepMind)
James Kirkpatrick (Google DeepMind)
Raia Hadsell (DeepMind)
Nicolas Heess (Google DeepMind)
Razvan Pascanu (Google DeepMind)
More from the Same Authors
-
2021 : LiRo: Benchmark and leaderboard for Romanian language tasks »
Stefan Dumitrescu · Petru Rebeja · Beata Lorincz · Mihaela Gaman · Andrei Avram · Mihai Ilie · Andrei Pruteanu · Adriana Stan · Lorena Rosia · Cristina Iacobescu · Luciana Morogan · George Dima · Gabriel Marchidan · Traian Rebedea · Madalina Chitez · Dani Yogatama · Sebastian Ruder · Radu Tudor Ionescu · Razvan Pascanu · Viorica Patraucean -
2021 : Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration »
Oliver Groth · Markus Wulfmeier · Giulia Vezzani · Vibhavari Dasagi · Tim Hertweck · Roland Hafner · Nicolas Heess · Martin Riedmiller -
2021 : Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies »
Dushyant Rao · Fereshteh Sadeghi · Leonard Hasenclever · Markus Wulfmeier · Martina Zambelli · Giulia Vezzani · Dhruva Tirumala · Yusuf Aytar · Josh Merel · Nicolas Heess · Raia Hadsell -
2021 : Uncertainty Quantification in End-to-End Implicit Neural Representations for Medical Imaging »
Francisca Vasconcelos · Bobby He · Yee Teh -
2021 : Offline Meta-Reinforcement Learning for Industrial Insertion »
Tony Zhao · Jianlan Luo · Oleg Sushkov · Rugile Pevceviciute · Nicolas Heess · Jonathan Scholz · Stefan Schaal · Sergey Levine -
2022 : Pre-training via Denoising for Molecular Property Prediction »
Sheheryar Zaidi · Michael Schaarschmidt · James Martens · Hyunjik Kim · Yee Whye Teh · Alvaro Sanchez Gonzalez · Peter Battaglia · Razvan Pascanu · Jonathan Godwin -
2022 : Learning to Look by Self-Prediction »
Matthew Grimes · Joseph Modayil · Piotr Mirowski · Dushyant Rao · Raia Hadsell -
2022 : When Does Re-initialization Work? »
Sheheryar Zaidi · Tudor Berariu · Hyunjik Kim · Jorg Bornschein · Claudia Clopath · Yee Whye Teh · Razvan Pascanu -
2022 Poster: Disentangling Transfer in Continual Reinforcement Learning »
Maciej Wolczyk · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś -
2022 Poster: Data augmentation for efficient learning from parametric experts »
Alexandre Galashov · Josh Merel · Nicolas Heess -
2022 Poster: The Phenomenon of Policy Churn »
Tom Schaul · Andre Barreto · John Quan · Georg Ostrovski -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 Poster: On Contrastive Representations of Stochastic Processes »
Emile Mathieu · Adam Foster · Yee Teh -
2021 Poster: Group Equivariant Subsampling »
Jin Xu · Hyunjik Kim · Thomas Rainforth · Yee Teh -
2021 Poster: Powerpropagation: A sparsity inducing weight reparameterisation »
Jonathan Richard Schwarz · Siddhant Jayakumar · Razvan Pascanu · Peter E Latham · Yee Teh -
2021 Poster: Neural Production Systems »
Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio -
2021 Poster: Continual World: A Robotic Benchmark For Continual Reinforcement Learning »
Maciej Wołczyk · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś -
2021 Poster: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations »
Tim G. J. Rudner · Cong Lu · Michael A Osborne · Yarin Gal · Yee Teh -
2021 Poster: On the Role of Optimization in Double Descent: A Least Squares Study »
Ilja Kuzborskij · Csaba Szepesvari · Omar Rivasplata · Amal Rannen-Triki · Razvan Pascanu -
2021 Poster: Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels »
Michael Hutchinson · Alexander Terenin · Viacheslav Borovitskiy · So Takao · Yee Teh · Marc Deisenroth -
2021 Poster: BayesIMP: Uncertainty Quantification for Causal Data Fusion »
Siu Lun Chau · Jean-Francois Ton · Javier González · Yee Teh · Dino Sejdinovic -
2021 Poster: Neural Ensemble Search for Uncertainty Estimation and Dataset Shift »
Sheheryar Zaidi · Arber Zela · Thomas Elsken · Chris C Holmes · Frank Hutter · Yee Teh -
2020 Poster: Top-KAST: Top-K Always Sparse Training »
Siddhant Jayakumar · Razvan Pascanu · Jack Rae · Simon Osindero · Erich Elsen -
2020 Poster: Discovering Reinforcement Learning Algorithms »
Junhyuk Oh · Matteo Hessel · Wojciech Czarnecki · Zhongwen Xu · Hado van Hasselt · Satinder Singh · David Silver -
2020 Poster: Value-driven Hindsight Modelling »
Arthur Guez · Fabio Viola · Theophane Weber · Lars Buesing · Steven Kapturowski · Doina Precup · David Silver · Nicolas Heess -
2020 Poster: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Spotlight: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Tutorial: (Track3) Designing Learning Dynamics Q&A »
Marta Garnelo · David Balduzzi · Wojciech Czarnecki -
2020 Poster: Understanding the Role of Training Regimes in Continual Learning »
Seyed Iman Mirzadeh · Mehrdad Farajtabar · Razvan Pascanu · Hassan Ghasemzadeh -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: Real World Games Look Like Spinning Tops »
Wojciech Czarnecki · Gauthier Gidel · Brendan Tracey · Karl Tuyls · Shayegan Omidshafiei · David Balduzzi · Max Jaderberg -
2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces »
Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow -
2020 Tutorial: (Track3) Designing Learning Dynamics »
Marta Garnelo · David Balduzzi · Wojciech Czarnecki -
2019 : Invited Talk - Raia Hadsell »
Raia Hadsell -
2019 : Challenges of Deep RL in Complex Environments »
Raia Hadsell -
2019 : Scalable Meta-Learning »
Raia Hadsell -
2019 : Raia Hadsell »
Raia Hadsell -
2019 : Opening session - Competition track »
Hugo Jair Escalante · Raia Hadsell -
2019 Poster: Continual Unsupervised Representation Learning »
Dushyant Rao · Francesco Visin · Andrei A Rusu · Razvan Pascanu · Yee Whye Teh · Raia Hadsell -
2019 Poster: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2019 Spotlight: Hindsight Credit Assignment »
Anna Harutyunyan · Will Dabney · Thomas Mesnard · Mohammad Gheshlaghi Azar · Bilal Piot · Nicolas Heess · Hado van Hasselt · Gregory Wayne · Satinder Singh · Doina Precup · Remi Munos -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Probabilistic Reasoning for Reinforcement Learning (Nicolas Heess) »
Nicolas Heess -
2018 : Invited Speaker #2 Raia Hadsell »
Raia Hadsell -
2018 : Introduction of the workshop »
Razvan Pascanu · Yee Teh · Mark Ring · Marc Pickett -
2018 Workshop: Continual Learning »
Razvan Pascanu · Yee Teh · Marc Pickett · Mark Ring -
2018 Poster: Learning to Navigate in Cities Without a Map »
Piotr Mirowski · Matt Grimes · Mateusz Malinowski · Karl Moritz Hermann · Keith Anderson · Denis Teplyashin · Karen Simonyan · koray kavukcuoglu · Andrew Zisserman · Raia Hadsell -
2018 Poster: Relational recurrent neural networks »
Adam Santoro · Ryan Faulkner · David Raposo · Jack Rae · Mike Chrzanowski · Theophane Weber · Daan Wierstra · Oriol Vinyals · Razvan Pascanu · Timothy Lillicrap -
2017 Workshop: Acting and Interacting in the Real World: Challenges in Robot Learning »
Ingmar Posner · Raia Hadsell · Martin Riedmiller · Markus Wulfmeier · Rohan Paul -
2017 Poster: A simple neural network module for relational reasoning »
Adam Santoro · David Raposo · David Barrett · Mateusz Malinowski · Razvan Pascanu · Peter Battaglia · Timothy Lillicrap -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Spotlight: A simple neural network module for relational reasoning »
Adam Santoro · David Raposo · David Barrett · Mateusz Malinowski · Razvan Pascanu · Peter Battaglia · Timothy Lillicrap -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Visual Interaction Networks: Learning a Physics Simulator from Video »
Nicholas Watters · Daniel Zoran · Theophane Weber · Peter Battaglia · Razvan Pascanu · Andrea Tacchetti -
2017 Poster: Filtering Variational Objectives »
Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh -
2017 Poster: Sobolev Training for Neural Networks »
Wojciech Czarnecki · Simon Osindero · Max Jaderberg · Grzegorz Swirszcz · Razvan Pascanu -
2017 Poster: Robust Imitation of Diverse Behaviors »
Ziyu Wang · Josh Merel · Scott Reed · Nando de Freitas · Gregory Wayne · Nicolas Heess -
2017 Poster: Learning Hierarchical Information Flow with Recurrent Neural Modules »
Danijar Hafner · Alexander Irpan · James Davidson · Nicolas Heess -
2016 Workshop: Continual Learning and Deep Networks »
Razvan Pascanu · Mark Ring · Tom Schaul -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Poster: The Forget-me-not Process »
Kieran Milan · Joel Veness · James Kirkpatrick · Michael Bowling · Anna Koop · Demis Hassabis -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · koray kavukcuoglu · Geoffrey E Hinton -
2016 Poster: Interaction Networks for Learning about Objects, Relations and Physics »
Peter Battaglia · Razvan Pascanu · Matthew Lai · Danilo Jimenez Rezende · koray kavukcuoglu -
2015 Poster: Natural Neural Networks »
Guillaume Desjardins · Karen Simonyan · Razvan Pascanu · koray kavukcuoglu -
2015 Poster: Gradient Estimation Using Stochastic Computation Graphs »
John Schulman · Nicolas Heess · Theophane Weber · Pieter Abbeel -
2015 Poster: Learning Continuous Control Policies by Stochastic Value Gradients »
Nicolas Heess · Gregory Wayne · David Silver · Timothy Lillicrap · Tom Erez · Yuval Tassa -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · koray kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · koray kavukcuoglu -
2014 Poster: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization »
Yann N Dauphin · Razvan Pascanu · Caglar Gulcehre · Kyunghyun Cho · Surya Ganguli · Yoshua Bengio -
2014 Poster: On the Number of Linear Regions of Deep Neural Networks »
Guido F Montufar · Razvan Pascanu · Kyunghyun Cho · Yoshua Bengio