The deep reinforcement learning (RL) framework has shown great promise for tackling sequential decision-making problems, where the agent learns to behave optimally by interacting with the environment and receiving rewards. The ability of an RL agent to learn different reward functions concurrently has many benefits, such as decomposing task rewards and promoting skill reuse. In this paper, we consider the problem of continuous control for robot manipulation tasks with an explicit representation that promotes skill reuse while learning multiple tasks with similar reward functions. Our approach relies on two key concepts: successor features (SFs), a value function representation that decouples the dynamics of the environment from the rewards, and an actor-critic framework that incorporates the learned SF representation. SFs form a natural bridge between model-based and model-free RL methods. We first show how to learn a decomposable representation required by SFs as a pre-training stage. The proposed architecture is able to learn decoupled state and reward feature representations for non-linear reward functions. We then evaluate the feasibility of integrating SFs into an actor-critic framework, which is more tailored for tasks solved with deep RL algorithms. The approach is empirically tested on non-trivial continuous control problems with compositional structure built into the reward functions of the tasks.
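To illustrate how SFs decouple dynamics from rewards, the following is a minimal tabular sketch, not the architecture proposed in the paper. It assumes the standard linear setting in which rewards factor as r(s) = phi(s) · w, a fixed policy, and a toy three-state transition matrix; the names `successor_features`, `value_from_sf`, `P`, `phi`, and the task weight vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def successor_features(P, phi, gamma):
    """Closed-form SFs for a fixed policy: psi = (I - gamma * P)^{-1} phi.

    P     : (n, n) state-transition matrix under the policy (assumed toy values)
    phi   : (n, d) state features
    gamma : discount factor in [0, 1)
    """
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, phi)


def value_from_sf(psi, w):
    """Value of every state for the task whose reward is r = phi @ w."""
    return psi @ w


# Toy dynamics and features (illustrative assumptions, not from the paper).
P = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
])
phi = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])
gamma = 0.9

# The SFs depend only on dynamics and features, so they are computed once.
psi = successor_features(P, phi, gamma)

# Two tasks that share dynamics but differ in reward weights.
w_task_a = np.array([1.0, 0.0])
w_task_b = np.array([0.0, 2.0])

v_task_a = value_from_sf(psi, w_task_a)
v_task_b = value_from_sf(psi, w_task_b)
```

In this sketch, switching tasks only changes the weight vector; `psi` is reused unchanged, which is the sense in which the representation supports skill reuse across tasks with similar reward functions.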
Author Information
Melissa Mozifian (Mila)
Dieter Fox (University of Washington)
David Meger (McGill University)
Fabio Ramos (University of Sydney, NVIDIA)
Animesh Garg (University of Toronto, Nvidia, Vector Institute)
I am a CIFAR AI Chair Assistant Professor of Computer Science at the University of Toronto, a Faculty Member at the Vector Institute, and Sr. Researcher at Nvidia. My current research focuses on machine learning for perception and control in robotics.
More from the Same Authors
-
2021 : Tutorial: Safe Learning for Decision Making »
Angela Schoellig · SiQi Zhou · Lukas Brunke · Animesh Garg · Melissa Greeff · Somil Bansal -
2021 : IL-flOw: Imitation Learning from Observation using Normalizing Flows »
Wei-Di Chang · Juan Camilo Gamboa Higuera · Scott Fujimoto · David Meger · Gregory Dudek -
2021 : Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World Trifinger »
Arthur Allshire · Mayank Mittal · Varun Lodaya · Viktor Makoviychuk · Denys Makoviichuk · Felix Widmaier · Manuel Wuethrich · Stefan Bauer · Ankur Handa · Animesh Garg -
2021 : Learning Discrete Neural Reaction Class to Improve Retrosynthesis Prediction »
Théophile Gaudin · Animesh Garg · Alan Aspuru-Guzik -
2021 : Reinforcement Learning in Factored Action Spaces using Tensor Decompositions »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviichuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2022 : CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation »
Adithyavairavan Murali · Arsalan Mousavian · Clemens Eppner · Adam Fishman · Dieter Fox -
2022 : ProgPrompt: Generating Situated Robot Task Plans using Large Language Models »
Ishika Singh · Valts Blukis · Arsalan Mousavian · Ankit Goyal · Danfei Xu · Jonathan Tremblay · Dieter Fox · Jesse Thomason · Animesh Garg -
2022 : Bayesian Q-learning With Imperfect Expert Demonstrations »
Fengdi Che · Xiru Zhu · Doina Precup · David Meger · Gregory Dudek -
2022 : Variance Reduction in Off-Policy Deep Reinforcement Learning using Spectral Normalization »
Payal Bawa · Rafael Oliveira · Fabio Ramos -
2022 : Debate: Robotics for Good »
Karol Hausman · Katherine Driggs-Campbell · Luca Carlone · Sarah Dean · Matthew Johnson-Roberson · Animesh Garg -
2022 : Panel: Uncertainty-Aware Machine Learning for Robotics (Q&A 1) »
Georgia Chalvatzaki · Stefanie Tellex · Animesh Garg -
2022 Workshop: 5th Robot Learning Workshop: Trustworthy Robotics »
Alex Bewley · Roberto Calandra · Anca Dragan · Igor Gilitschenski · Emily Hannigan · Masha Itkina · Hamidreza Kasaei · Jens Kober · Danica Kragic · Nathan Lambert · Julien PEREZ · Fabio Ramos · Ransalu Senanayake · Jonathan Tompson · Vincent Vanhoucke · Markus Wulfmeier -
2022 Workshop: The Symbiosis of Deep Learning and Differential Equations II »
Michael Poli · Winnie Xu · Estefany Kelly Buchanan · Maryam Hosseini · Luca Celotti · Martin Magill · Ermal Rrapaj · Qiyao Wei · Stefano Massaroli · Patrick Kidger · Archis Joglekar · Animesh Garg · David Duvenaud -
2022 Spotlight: Batch Bayesian optimisation via density-ratio estimation with guarantees »
Rafael Oliveira · Louis Tiao · Fabio Ramos -
2022 Poster: Batch Bayesian optimisation via density-ratio estimation with guarantees »
Rafael Oliveira · Louis Tiao · Fabio Ramos -
2022 Poster: Continuous MDP Homomorphisms and Homomorphic Policy Gradient »
Sahand Rezaei-Shoshtari · Rosie Zhao · Prakash Panangaden · David Meger · Doina Precup -
2021 : Panel B: Safe Learning and Decision Making in Uncertain and Unstructured Environments »
Yisong Yue · J. Zico Kolter · Ivan Dario D Jimenez Rodriguez · Dragos Margineantu · Animesh Garg · Melissa Greeff -
2021 : Theme B Introduction »
Animesh Garg -
2021 Workshop: Deployable Decision Making in Embodied Systems (DDM) »
Angela Schoellig · Animesh Garg · Somil Bansal · SiQi Zhou · Melissa Greeff · Lukas Brunke -
2021 Workshop: The Symbiosis of Deep Learning and Differential Equations »
Luca Celotti · Kelly Buchanan · Jorge Ortiz · Patrick Kidger · Stefano Massaroli · Michael Poli · Lily Hu · Ermal Rrapaj · Martin Magill · Thorsteinn Jonsson · Animesh Garg · Murtadha Aldeer -
2021 : Safe RL Debate »
Sylvia Herbert · Animesh Garg · Emma Brunskill · Aleksandra Faust · Dylan Hadfield-Menell -
2021 : Safe RL Panel Discussion »
Animesh Garg · Marek Petrik · Shie Mannor · Claire Tomlin · Ugo Rosolia · Dylan Hadfield-Menell -
2021 Poster: Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers »
Mikita Dvornik · Isma Hadji · Konstantinos Derpanis · Animesh Garg · Allan Jepson -
2021 Poster: Neural Hybrid Automata: Learning Dynamics With Multiple Modes and Stochastic Transitions »
Michael Poli · Stefano Massaroli · Luca Scimeca · Sanghyuk Chun · Seong Joon Oh · Atsushi Yamashita · Hajime Asama · Jinkyoo Park · Animesh Garg -
2021 Poster: Dynamic Bottleneck for Robust Self-Supervised Exploration »
Chenjia Bai · Lingxiao Wang · Lei Han · Animesh Garg · Jianye Hao · Peng Liu · Zhaoran Wang -
2020 : Invited Talk - "RL with Sim2Real in the Loop / Online Domain Adaptation for Mapping" »
Fabio Ramos · Anthony Tompkins -
2020 : Discussion Panel »
Pete Florence · Dorsa Sadigh · Carolina Parada · Jeannette Bohg · Roberto Calandra · Peter Stone · Fabio Ramos -
2020 : Bayesian optimization by density ratio estimation »
Louis Tiao · Aaron Klein · Cedric Archambeau · Edwin Bonilla · Matthias W Seeger · Fabio Ramos -
2020 Poster: Sparse Spectrum Warped Input Measures for Nonstationary Kernel Learning »
Anthony Tompkins · Rafael Oliveira · Fabio Ramos -
2020 Poster: Causal Discovery in Physical Systems from Videos »
Yunzhu Li · Antonio Torralba · Anima Anandkumar · Dieter Fox · Animesh Garg -
2020 Poster: Curriculum By Smoothing »
Samarth Sinha · Animesh Garg · Hugo Larochelle -
2020 Spotlight: Curriculum By Smoothing »
Samarth Sinha · Animesh Garg · Hugo Larochelle -
2020 Poster: An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay »
Scott Fujimoto · David Meger · Doina Precup -
2020 Poster: 3D Shape Reconstruction from Vision and Touch »
Edward Smith · Roberto Calandra · Adriana Romero · Georgia Gkioxari · David Meger · Jitendra Malik · Michal Drozdzal -
2020 Poster: Counterfactual Data Augmentation using Locally Factored Dynamics »
Silviu Pitis · Elliot Creager · Animesh Garg -
2020 Session: Orals & Spotlights Track 06: Dynamical Sys/Density/Sparsity »
Animesh Garg · Rose Yu -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 : Poster Session »
Lili Yu · Aleksei Kroshnin · Alex Delalande · Andrew Carr · Anthony Tompkins · Aram-Alexandre Pooladian · Arnaud Robert · Ashok Vardhan Makkuva · Aude Genevay · Bangjie Liu · Bo Zeng · Charlie Frogner · Elsa Cazelles · Esteban G Tabak · Fabio Ramos · François-Pierre PATY · Georgios Balikas · Giulio Trigila · Hao Wang · Hinrich Mahler · Jared Nielsen · Karim Lounici · Kyle Swanson · Mukul Bhutani · Pierre Bréchet · Piotr Indyk · samuel cohen · Stefanie Jegelka · Tao Wu · Thibault Sejourne · Tudor Manole · Wenjun Zhao · Wenlin Wang · Wenqi Wang · Yonatan Dukler · Zihao Wang · Chaosheng Dong -
2018 : Poster Session »
Sujay Sanghavi · Vatsal Shah · Yanyao Shen · Tianchen Zhao · Yuandong Tian · Tomer Galanti · Mufan Li · Gilad Cohen · Daniel Rothchild · Aristide Baratin · Devansh Arpit · Vagelis Papalexakis · Michael Perlmutter · Ashok Vardhan Makkuva · Pim de Haan · Yingyan Lin · Wanmo Kang · Cheolhyoung Lee · Hao Shen · Sho Yaida · Dan Roberts · Nadav Cohen · Philippe Casgrain · Dejiao Zhang · Tengyu Ma · Avinash Ravichandran · Julian Emilio Salazar · Bo Li · Davis Liang · Christopher Wong · Glen Bigan Mbeng · Animesh Garg -
2018 : Fabio Ramos (Uni. of Sydney): Learning and Planning in Spatial-Temporal Data »
Fabio Ramos -
2018 Workshop: Modeling and decision-making in the spatiotemporal domain »
Ransalu Senanayake · Neal Jean · Fabio Ramos · Girish Chowdhary -
2018 Poster: Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models »
Amir Dezfouli · Richard Morris · Fabio Ramos · Peter Dayan · Bernard Balleine -
2018 Oral: Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models »
Amir Dezfouli · Richard Morris · Fabio Ramos · Peter Dayan · Bernard Balleine -
2018 Poster: Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation »
Edward Smith · Scott Fujimoto · David Meger -
2016 Poster: Spatio-Temporal Hilbert Maps for Continuous Occupancy Representation in Dynamic Environments »
Ransalu Senanayake · Lionel Ott · Simon O'Callaghan · Fabio Ramos -
2014 Poster: On Integrated Clustering and Outlier Detection »
Lionel Ott · Linsey Pang · Fabio Ramos · Sanjay Chawla -
2011 Poster: Hierarchical Matching Pursuit for Recognition: Architecture and Fast Algorithms »
Liefeng Bo · Xiaofeng Ren · Dieter Fox -
2010 Spotlight: Kernel Descriptors for Visual Recognition »
Liefeng Bo · Xiaofeng Ren · Dieter Fox -
2010 Poster: Kernel Descriptors for Visual Recognition »
Liefeng Bo · Xiaofeng Ren · Dieter Fox