The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make decisions at each point in training and, hence, limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has long exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations. These techniques, however, lack the flexibility and generalizability provided by deep learning or deep reinforcement learning and have remained under-explored in such settings. In this work, we begin to close this gap and embed dynamics structure into deep neural network-based policies by reparameterizing action spaces with differential equations. We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space, as opposed to prior policy learning methods where actions represent the raw control space. The embedded structure allows us to perform end-to-end policy learning under both reinforcement and imitation learning setups. We show that NDPs achieve better or comparable performance to state-of-the-art approaches on many robotic control tasks using both reward-based training and demonstrations. Project video and code are available at: https://shikharbahl.github.io/neural-dynamic-policies/.
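To make the idea of "reparameterizing action spaces with differential equations" concrete: instead of emitting raw torques or positions, a policy can emit the parameters of a second-order dynamical system (e.g. a dynamic movement primitive), which is then integrated into a trajectory. Below is a minimal one-dimensional sketch of such a rollout; the function name, gains, and basis-function layout are illustrative assumptions for exposition, not the authors' implementation (see the linked project page for that).

```python
import math

def dmp_rollout(y0, goal, weights, n_steps=300, dt=0.01,
                alpha=25.0, beta=6.25, alpha_x=3.0):
    """Integrate a 1-D DMP-style system: a damped spring pulled toward
    `goal`, perturbed by a learned forcing term. A policy network would
    predict `goal` and `weights` instead of raw per-step actions.
    All gains here are illustrative defaults, not tuned values."""
    n_basis = len(weights)
    # Basis centers spaced along the canonical variable's exponential decay.
    centers = [math.exp(-alpha_x * i / max(n_basis - 1, 1))
               for i in range(n_basis)]
    widths = [n_basis / (c * c) for c in centers]  # heuristic widths
    y, dy, x = y0, 0.0, 1.0  # state, velocity, canonical phase
    traj = [y]
    for _ in range(n_steps):
        psi = [math.exp(-h * (x - c) ** 2)
               for h, c in zip(widths, centers)]
        num = sum(p * w for p, w in zip(psi, weights))
        # Forcing term vanishes as the phase x decays to 0, so the
        # spring dynamics guarantee convergence to the goal.
        forcing = x * (goal - y0) * num / (sum(psi) + 1e-10)
        ddy = alpha * (beta * (goal - y) - dy) + forcing
        dy += ddy * dt
        y += dy * dt
        x -= alpha_x * x * dt  # canonical system: dx/dt = -alpha_x * x
        traj.append(y)
    return traj
```

With zero forcing weights the system reduces to a critically damped spring, so the rollout smoothly converges to the goal; nonzero weights shape the transient into an arbitrary learned trajectory. This is what lets a single network prediction specify an entire smooth trajectory segment rather than one raw action per timestep.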
Author Information
Shikhar Bahl (Carnegie Mellon University)
Mustafa Mukadam (Facebook AI Research)
Abhinav Gupta (Facebook AI Research/CMU)
Deepak Pathak (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
- 2020 Spotlight: Neural Dynamic Policies for End-to-End Sensorimotor Learning »
  Thu Dec 10th, 03:10 -- 03:20 PM, Orals & Spotlights: Reinforcement Learning
More from the Same Authors
- 2020 Poster: Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases »
  Senthil Purushwalkam · Abhinav Gupta
- 2020 Session: Orals & Spotlights Track 14: Reinforcement Learning »
  Deepak Pathak · Martha White
- 2020 Poster: Sparse Graphical Memory for Robust Planning »
  Scott Emmons · Ajay Jain · Misha Laskin · Thanard Kurutach · Pieter Abbeel · Deepak Pathak
- 2020 Poster: See, Hear, Explore: Curiosity via Audio-Visual Association »
  Victoria Dean · Shubham Tulsiani · Abhinav Gupta
- 2020 Poster: Object Goal Navigation using Goal-Oriented Semantic Exploration »
  Devendra Singh Chaplot · Dhiraj Prakashchand Gandhi · Abhinav Gupta · Russ Salakhutdinov
- 2019 Poster: Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller »
  Pratyusha Sharma · Deepak Pathak · Abhinav Gupta
- 2018 Workshop: Imitation Learning and its Challenges in Robotics »
  Mustafa Mukadam · Sanjiban Choudhury · Siddhartha Srinivasa
- 2018 Poster: Hardware Conditioned Policies for Multi-Robot Transfer Learning »
  Tao Chen · Adithyavairavan Murali · Abhinav Gupta
- 2018 Poster: Beyond Grids: Learning Graph Representations for Visual Recognition »
  Yin Li · Abhinav Gupta
- 2018 Poster: Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias »
  Abhinav Gupta · Adithyavairavan Murali · Dhiraj Prakashchand Gandhi · Lerrel Pinto
- 2013 Poster: Mid-level Visual Element Discovery as Discriminative Mode Seeking »
  Carl Doersch · Abhinav Gupta · Alexei A Efros
- 2010 Poster: Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces »
  David C Lee · Abhinav Gupta · Martial Hebert · Takeo Kanade
- 2008 Poster: A "Shape Aware" Model for Semi-Supervised Learning of Objects and its Context »
  Abhinav Gupta · Jianbo Shi · Larry Davis
- 2008 Spotlight: A "Shape Aware" Model for Semi-Supervised Learning of Objects and its Context »
  Abhinav Gupta · Jianbo Shi · Larry Davis