A practical approach to learning robot skills, often termed sim2real, is to train control policies in simulation and then deploy them on a real robot. Popular techniques for improving sim2real transfer build on domain randomization (DR) - training the policy on a diverse set of randomly generated domains in the hope of better generalization to the real world. Due to the large number of hyper-parameters in both the policy learning and DR algorithms, one often ends up with a large number of trained models, and choosing the best model among them demands costly evaluation on the real robot. In this work we ask - can we rank the policies without running them in the real world? Our main idea is that a predefined set of real world data can be used to evaluate all policies, using out-of-distribution (OOD) detection techniques. In a sense, this approach can be seen as a "unit test" to evaluate policies before any real world execution. However, we find that by itself, the OOD score can be inaccurate and very sensitive to the particular OOD method. Our main contribution is a simple-yet-effective policy score that combines OOD with an evaluation in simulation. We show that our score - VSDR - can significantly improve the accuracy of policy ranking without requiring additional real world data. We evaluate the effectiveness of VSDR on sim2real transfer in a robotic grasping task with image inputs. We extensively evaluate different DR parameters and OOD methods, and show that VSDR improves policy selection across the board. More importantly, our method achieves significantly better ranking, and uses significantly less data compared to baselines.
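The abstract's core idea - scoring each trained policy by combining its simulation performance with an OOD score computed on a fixed set of real-world data - can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of OOD detector, and in particular the product used to combine the two scores are all assumptions made here for clarity (the paper's actual combination rule may differ).

```python
# Illustrative sketch of ranking sim-trained policies by a combined
# "validate on sim, detect on real" style score. All names and the
# product-based combination rule below are assumptions, not the paper's
# actual method.

def rank_policies(policies, sim_score, ood_score, real_data):
    """Rank policies without any real-world rollouts.

    sim_score(policy)           -> validation return in simulation (higher is better)
    ood_score(policy, real_data) -> how in-distribution the fixed real-world
                                    data looks to the policy's model
                                    (higher = real data looks familiar)
    """
    scored = []
    for policy in policies:
        v = sim_score(policy)             # evaluate in simulation
        d = ood_score(policy, real_data)  # "unit test" on real data, no execution
        scored.append((v * d, policy))    # one simple way to combine the two
    scored.sort(key=lambda t: t[0], reverse=True)
    return [p for _, p in scored]
```

A policy that scores well in simulation but whose model finds the real data out-of-distribution is ranked below one that balances both, which is the intuition behind combining the two signals.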
Author Information
Guy Jacob (Intel Labs)
Gal Leibovich (Intel Labs)
Shadi Endrawis (Intel Labs)
Gal Novik (Intel Labs)
Aviv Tamar (Technion)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 : Validate on Sim, Detect on Real - Model Selection for Domain Randomization »
More from the Same Authors
-
2021 : Deep Variational Semi-Supervised Novelty Detection »
Tal Daniel · Thanard Kurutach · Aviv Tamar -
2022 : Learning Control by Iterative Inversion »
Gal Leibovich · Guy Jacob · Or Avner · Gal Novik · Aviv Tamar -
2022 : Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning »
Dan Elbaz · Gal Novik · Oren Salzman -
2023 Poster: Explore to Generalize in Zero-Shot RL »
Ev Zisselman · Itai Lavie · Daniel Soudry · Aviv Tamar -
2023 Workshop: Generalization in Planning (GenPlan '23) »
Pulkit Verma · Siddharth Srivastava · Aviv Tamar · Felipe Trevizan -
2022 Poster: Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach »
Zohar Rimon · Aviv Tamar · Gilad Adler -
2021 : Q&A for Aviv Tamar »
Aviv Tamar -
2021 : Spotlights »
Hager Radi · Krishan Rana · Yunzhu Li · Shuang Li · Gal Leibovich · Guy Jacob · Ruihan Yang -
2021 : Learning to Explore From Data »
Aviv Tamar -
2021 Poster: Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies »
Ron Dorfman · Idan Shenfeld · Aviv Tamar -
2021 Poster: Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias »
Raanan Rohekar · Shami Nisimov · Yaniv Gurwicz · Gal Novik -
2020 : Mini-panel discussion 1 - Bridging the gap between theory and practice »
Aviv Tamar · Emma Brunskill · Jost Tobias Springenberg · Omer Gottesman · Daniel Mankowitz -
2020 : Keynote: Aviv Tamar »
Aviv Tamar -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 Poster: Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections »
Raanan Rohekar · Yaniv Gurwicz · Shami Nisimov · Gal Novik -
2018 Poster: Bayesian Structure Learning by Recursive Bootstrap »
Raanan Y. Rohekar · Yaniv Gurwicz · Shami Nisimov · Guy Koren · Gal Novik -
2018 Poster: Constructing Deep Neural Networks by Bayesian Network Structure Learning »
Raanan Rohekar · Shami Nisimov · Yaniv Gurwicz · Guy Koren · Gal Novik -
2018 Poster: Learning Plannable Representations with Causal InfoGAN »
Thanard Kurutach · Aviv Tamar · Ge Yang · Stuart Russell · Pieter Abbeel -
2017 Poster: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments »
Ryan Lowe · YI WU · Aviv Tamar · Jean Harb · OpenAI Pieter Abbeel · Igor Mordatch -
2017 Poster: Shallow Updates for Deep Reinforcement Learning »
Nir Levine · Tom Zahavy · Daniel J Mankowitz · Aviv Tamar · Shie Mannor -
2016 Poster: Value Iteration Networks »
Aviv Tamar · Sergey Levine · Pieter Abbeel · YI WU · Garrett Thomas -
2016 Oral: Value Iteration Networks »
Aviv Tamar · Sergey Levine · Pieter Abbeel · YI WU · Garrett Thomas