Timezone: »
We consider how to most efficiently leverage teleoperator time to collect data for learning robust image-based value functions and policies for sparse reward robotic tasks. To accomplish this goal, we modify the process of data collection to include more than just successful demonstrations of the desired task. Instead we develop a novel protocol that we call Visual Backtracking Teleoperation (VBT), which deliberately collects a dataset of visually similar failures, recoveries, and successes. VBT data collection is particularly useful for efficiently learning accurate value functions from small datasets of image-based observations. We demonstrate VBT on a real robot to perform continuous control from image observations for the deformable manipulation task of T-shirt grasping. We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection. Specifically, we find that offline reinforcement learning on VBT data outperforms standard behavior cloning on successful demonstration data by 13% when both methods are given equal-sized datasets of 60 minutes of data from the real robot.
Author Information
David Brandfonbrener (New York University)
Stephen Tu (Google)
Avi Singh (Google Brain)
Stefan Welker (Google)
Chad Boodoo (Google)
Nikolai Matni (University of Pennsylvania)
Jacob Varley (Google)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 : Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based RL »
Dates n/a. Room
More from the Same Authors
-
2021 Spotlight: Offline RL Without Off-Policy Evaluation »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2021 : Quantile Filtered Imitation Learning »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2022 : Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based RL »
David Brandfonbrener · Stephen Tu · Avi Singh · Stefan Welker · Chad Boodoo · Nikolai Matni · Jake Varley -
2022 Poster: TaSIL: Taylor Series Imitation Learning »
Daniel Pfrommer · Thomas Zhang · Stephen Tu · Nikolai Matni -
2022 Poster: When does return-conditioned supervised learning work for offline reinforcement learning? »
David Brandfonbrener · Alberto Bietti · Jacob Buckman · Romain Laroche · Joan Bruna -
2021 Poster: Offline RL Without Off-Policy Evaluation »
David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna -
2020 : Contributed Talk 2: Chaining Behaviors from Data with Model-Free Reinforcement Learning »
Avi Singh -
2020 Poster: Ode to an ODE »
Krzysztof Choromanski · Jared Quincy Davis · Valerii Likhosherstov · Xingyou Song · Jean-Jacques Slotine · Jacob Varley · Honglak Lee · Adrian Weller · Vikas Sindhwani -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 : Poster Spotlight 1 »
David Brandfonbrener · Joan Bruna · Tom Zahavy · Haim Kaplan · Yishay Mansour · Nikos Karampatziakis · John Langford · Paul Mineiro · Donghwan Lee · Niao He -
2018 : Poster Session 1 + Coffee »
Tom Van de Wiele · Rui Zhao · J. Fernando Hernandez-Garcia · Fabio Pardo · Xian Yeow Lee · Xiaolin Andy Li · Marcin Andrychowicz · Jie Tang · Suraj Nair · Juhyeon Lee · Cédric Colas · S. M. Ali Eslami · Yen-Chen Wu · Stephen McAleer · Ryan Julian · Yang Xue · Matthia Sabatelli · Pranav Shyam · Alexandros Kalousis · Giovanni Montana · Emanuele Pesce · Felix Leibfried · Zhanpeng He · Chunxiao Liu · Yanjun Li · Yoshihide Sawada · Alexander Pashevich · Tejas Kulkarni · Keiran Paster · Luca Rigazio · Quan Vuong · Hyunggon Park · Minhae Kwon · Rivindu Weerasekera · Shamane Siriwardhanaa · Rui Wang · Ozsel Kilinc · Keith Ross · Yizhou Wang · Simon Schmitt · Thomas Anthony · Evan Cater · Forest Agostinelli · Tegg Sung · Shirou Maruyama · Alexander Shmakov · Devin Schwab · Mohammad Firouzi · Glen Berseth · Denis Osipychev · Jesse Farebrother · Jianlan Luo · William Agnew · Peter Vrancx · Jonathan Heek · Catalin Ionescu · Haiyan Yin · Megumi Miyashita · Nathan Jay · Noga H. Rotman · Sam Leroux · Shaileshh Bojja Venkatakrishnan · Henri Schmidt · Jack Terwilliger · Ishan Durugkar · Jonathan Sauder · David Kas · Arash Tavakoli · Alain-Sam Cohen · Philip Bontrager · Adam Lerer · Thomas Paine · Ahmed Khalifa · Ruben Rodriguez · Avi Singh · Yiming Zhang -
2018 Poster: Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition »
Justin Fu · Avi Singh · Dibya Ghosh · Larry Yang · Sergey Levine