Timezone: »
Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify key ingredients for leveraging demonstrations in model learning -- policy pretraining, targeted exploration, and oversampling of demonstration data -- which forms the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 160%-250%more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100K interaction steps, 5 demonstrations).
Author Information
Nicklas Hansen (UC San Diego)
Yixin Lin (Facebook AI Research)
Hao Su (UCSD)
Xiaolong Wang (UC San Diego)
Vikash Kumar (FAIR, Meta-AI)

I am currently a research scientist at Facebook AI Research (FAIR). I have also spent some time at Google-Brain, OpenAI and Berkeley Artificial Intelligence Research (BAIR) Lab. I did my PhD at CSE, University of Washington's Movement Control Lab, under the supervision of Prof. Emanuel Todorov and Prof. Sergey Levine. I am interested in the areas of Robotics, and Embodied Artificial Intelligence. My general interest lies in developing artificial agents that are cheap, portable and exhibit complex behaviors.
Aravind Rajeswaran (FAIR)
More from the Same Authors
-
2021 : RB2: Robotic Manipulation Benchmarking with a Twist »
Sudeep Dasari · Jianren Wang · Joyce Hong · Shikhar Bahl · Yixin Lin · Austin Wang · Abitha Thankaraj · Karanbir Chahal · Berk Calli · Saurabh Gupta · David Held · Lerrel Pinto · Deepak Pathak · Vikash Kumar · Abhinav Gupta -
2021 : ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations »
Tongzhou Mu · Zhan Ling · Fanbo Xiang · Derek Yang · Xuanlin Li · Stone Tao · Zhiao Huang · Zhiwei Jia · Hao Su -
2021 : From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation »
Yuzhe Qin · Hao Su · Xiaolong Wang -
2021 : Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers »
Ruihan Yang · Minghao Zhang · Nicklas Hansen · Huazhe Xu · Xiaolong Wang -
2021 : Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation »
Rishabh Jangir · Nicklas Hansen · Xiaolong Wang -
2021 : CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery »
Misha Laskin · Hao Liu · Xue Bin Peng · Denis Yarats · Aravind Rajeswaran · Pieter Abbeel -
2021 : Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL »
Catherine Cang · Aravind Rajeswaran · Pieter Abbeel · Misha Laskin -
2021 : Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers »
Ruihan Yang · Minghao Zhang · Nicklas Hansen · Huazhe Xu · Xiaolong Wang -
2021 : Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers »
Ruihan Yang · Minghao Zhang · Nicklas Hansen · Huazhe Xu · Xiaolong Wang -
2021 : Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization »
Chieko Imai · Minghao Zhang · Ruihan Yang · Yuzhe Qin · Xiaolong Wang -
2021 : Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation »
Rishabh Jangir · Nicklas Hansen · Mohit Jain · Xiaolong Wang -
2022 : On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning »
yifan xu · Nicklas Hansen · Zirui Wang · Yung-Chieh Chan · Hao Su · Zhuowen Tu -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Real World Offline Reinforcement Learning with Realistic Data Source »
Gaoyue Zhou · Liyiming Ke · Siddhartha Srinivasa · Abhinav Gupta · Aravind Rajeswaran · Vikash Kumar -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Offline Reinforcement Learning on Real Robot with Realistic Data Sources »
Gaoyue Zhou · Liyiming Ke · Siddhartha Srinivasa · Abhinav Gupta · Aravind Rajeswaran · Vikash Kumar -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset »
Yang Fu · Xiaolong Wang -
2022 : Fifteen-minute Competition Overview Video »
Guillaume Durandau · Yuval Tassa · Vittorio Caggiano · Vikash Kumar · Seungmoon Song · Massimo Sartori · -
2022 : Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps »
Sudeep Dasari · Vikash Kumar -
2022 : Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation »
Yuzhe Qin · Binghao Huang · Zhao-Heng Yin · Hao Su · Xiaolong Wang -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Offline Reinforcement Learning on Real Robot with Realistic Data Sources »
Gaoyue Zhou · Liyiming Ke · Siddhartha Srinivasa · Abhinav Gupta · Aravind Rajeswaran · Vikash Kumar -
2022 : Policy Architectures for Compositional Generalization in Control »
Allan Zhou · Vikash Kumar · Chelsea Finn · Aravind Rajeswaran -
2022 : Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization »
Stone Tao · Xiaochen Li · Tongzhou Mu · Zhiao Huang · Yuzhe Qin · Hao Su -
2022 : VARIATIONAL REPARAMETRIZED POLICY LEARNING WITH DIFFERENTIABLE PHYSICS »
Zhiao Huang · Litian Liang · Zhan Ling · Xuanlin Li · Chuang Gan · Hao Su -
2022 : Multi-skill Mobile Manipulation for Object Rearrangement »
Jiayuan Gu · Devendra Singh Chaplot · Hao Su · Jitendra Malik -
2022 : Visual Reinforcement Learning with Self-Supervised 3D Representations »
Yanjie Ze · Nicklas Hansen · Yinbo Chen · Mohit Jain · Xiaolong Wang -
2022 : Graph Inverse Reinforcement Learning from Diverse Videos »
Sateesh Kumar · Jonathan Zamora · Nicklas Hansen · Rishabh Jangir · Xiaolong Wang -
2022 : On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning »
yifan xu · Nicklas Hansen · Zirui Wang · Yung-Chieh Chan · Hao Su · Zhuowen Tu -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Real World Offline Reinforcement Learning with Realistic Data Source »
Gaoyue Zhou · Liyiming Ke · Siddhartha Srinivasa · Abhinav Gupta · Aravind Rajeswaran · Vikash Kumar -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 Competition: MyoChallenge: Learning contact-rich manipulation using a musculoskeletal hand »
Vittorio Caggiano · · Guillaume Durandau · Seungmoon Song · Yuval Tassa · Massimo Sartori · Vikash Kumar -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 : Train Offline, Test Online: A Real Robot Learning Benchmark »
Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta -
2022 Workshop: Self-Supervised Learning: Theory and Practice »
Ishan Misra · Pengtao Xie · Gul Varol · Yale Song · Yuki Asano · Xiaolong Wang · Pauline Luc -
2022 Workshop: 3rd Offline Reinforcement Learning Workshop: Offline RL as a "Launchpad" »
Aviral Kumar · Rishabh Agarwal · Aravind Rajeswaran · Wenxuan Zhou · George Tucker · Doina Precup · Aviral Kumar -
2022 Poster: Unsupervised Reinforcement Learning with Contrastive Intrinsic Control »
Michael Laskin · Hao Liu · Xue Bin Peng · Denis Yarats · Aravind Rajeswaran · Pieter Abbeel -
2022 Poster: Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset »
Yang Fu · Xiaolong Wang -
2021 : Efficient and Interpretable Robot Manipulation with Graph Neural Networks »
Yixin Lin · Austin Wang · Eric Undersander · Akshara Rai -
2021 Poster: Visual Adversarial Imitation Learning using Variational Models »
Rafael Rafailov · Tianhe Yu · Aravind Rajeswaran · Chelsea Finn -
2021 Poster: COMBO: Conservative Offline Model-Based Policy Optimization »
Tianhe Yu · Aviral Kumar · Rafael Rafailov · Aravind Rajeswaran · Sergey Levine · Chelsea Finn -
2021 Poster: Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation »
Nicklas Hansen · Hao Su · Xiaolong Wang -
2021 Poster: Decision Transformer: Reinforcement Learning via Sequence Modeling »
Lili Chen · Kevin Lu · Aravind Rajeswaran · Kimin Lee · Aditya Grover · Misha Laskin · Pieter Abbeel · Aravind Srinivas · Igor Mordatch -
2021 Poster: Particle Cloud Generation with Message Passing Generative Adversarial Networks »
Raghav Kansal · Javier Duarte · Hao Su · Breno Orzari · Thiago Tomei · Maurizio Pierini · Mary Touranakou · jean-roch vlimant · Dimitrios Gunopulos -
2021 Poster: Reinforcement Learning with Latent Flow »
Wenling Shang · Xiaofei Wang · Aravind Srinivas · Aravind Rajeswaran · Yang Gao · Pieter Abbeel · Misha Laskin -
2020 Workshop: Object Representations for Learning and Reasoning »
William Agnew · Rim Assouel · Michael Chang · Antonia Creswell · Eliza Kosoy · Aravind Rajeswaran · Sjoerd van Steenkiste -
2020 Poster: Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous GNNs »
Hao Tang · Zhiao Huang · Jiayuan Gu · Bao-Liang Lu · Hao Su -
2020 Poster: Online Adaptation for Consistent Mesh Reconstruction in the Wild »
Xueting Li · Sifei Liu · Shalini De Mello · Kihwan Kim · Xiaolong Wang · Ming-Hsuan Yang · Jan Kautz -
2020 Poster: Multi-task Batch Reinforcement Learning with Metric Learning »
Jiachen Li · Quan Vuong · Shuang Liu · Minghua Liu · Kamil Ciosek · Henrik Christensen · Hao Su -
2020 Poster: Multi-Task Reinforcement Learning with Soft Modularization »
Ruihan Yang · Huazhe Xu · YI WU · Xiaolong Wang -
2020 Poster: MOReL: Model-Based Offline Reinforcement Learning »
Rahul Kidambi · Aravind Rajeswaran · Praneeth Netrapalli · Thorsten Joachims -
2020 Poster: Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals »
Tongzhou Mu · Jiayuan Gu · Zhiwei Jia · Hao Tang · Hao Su -
2019 Poster: Meta-Learning with Implicit Gradients »
Aravind Rajeswaran · Chelsea Finn · Sham Kakade · Sergey Levine -
2018 Poster: Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions »
Minhyuk Sung · Hao Su · Ronald Yu · Leonidas Guibas -
2017 Poster: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space »
Charles Ruizhongtai Qi · Li Yi · Hao Su · Leonidas Guibas -
2017 Poster: Towards Generalization and Simplicity in Continuous Control »
Aravind Rajeswaran · Kendall Lowrey · Emanuel Todorov · Sham Kakade -
2016 Poster: FPNN: Field Probing Neural Networks for 3D Data »
Yangyan Li · Soeren Pirk · Hao Su · Charles R Qi · Leonidas Guibas -
2010 Poster: Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification »
Li-Jia Li · Hao Su · Eric Xing · Li Fei-Fei