Timezone: »
Reinforcement learning is difficult to apply to real world problems due to high sample complexity, the need to adapt to frequent distribution shifts and the complexities of learning from high-dimensional inputs, such as images. Over the last several years, meta-learning has emerged as a promising approach to tackle these problems by explicitly training an agent to quickly adapt to new tasks. However, such methods still require huge amounts of data during training and are difficult to optimize in high-dimensional domains. One potential solution is to consider offline or batch meta-reinforcement learning (RL) - learning from existing datasets without additional environment interactions during training. In this work we develop the first offline model-based meta-RL algorithm that operates from images in tasks with sparse rewards. Our approach has three main components: a novel strategy to construct meta-exploration trajectories from offline data, which allows agents to learn meaningful meta-test time task inference strategy; representation learning via variational filtering and latent conservative model-free policy optimization. We show that our method completely solves a realistic meta-learning task involving robot manipulation, while naive combinations of previous approaches fail.
Author Information
Rafael Rafailov (Stanford University)
Tianhe Yu (Stanford University)
Avi Singh (UC Berkeley)
Mariano Phielipp (Intel AI Labs)
Dr. Mariano Phielipp works at the Intel AI Lab inside the Intel Artificial Intelligence Products Group. His work includes research and development in deep learning, deep reinforcement learning, machine learning, and artificial intelligence. Since joining Intel, Dr. Phielipp has developed and worked on Computer Vision, Face Recognition, Face Detection, Object Categorization, Recommendation Systems, Online Learning, Automatic Rule Learning, Natural Language Processing, Knowledge Representation, Energy Based Algorithms, and other Machine Learning and AI-related efforts. Dr. Phielipp has also contributed to different disclosure committees, won an Intel division award related to Robotics, and has a large number of patents and pending patents. He has published on NeuriPS, ICML, ICLR, AAAI, IROS, IEEE, SPIE, IASTED, and EUROGRAPHICS-IEEE Conferences and Workshops.
Chelsea Finn (Stanford)
More from the Same Authors
-
2021 Spotlight: Efficiently Identifying Task Groupings for Multi-Task Learning »
Chris Fifty · Ehsan Amid · Zhe Zhao · Tianhe Yu · Rohan Anil · Chelsea Finn -
2021 : Correct-N-Contrast: A Contrastive Approach for Improving Robustness to Spurious Correlations »
Michael Zhang · Nimit Sohoni · Hongyang Zhang · Chelsea Finn · Christopher RĂ© -
2021 : Extending the WILDS Benchmark for Unsupervised Adaptation »
Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang -
2021 : Test Time Robustification of Deep Models via Adaptation and Augmentation »
Marvin Zhang · Sergey Levine · Chelsea Finn -
2021 : Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning »
Tianhe Yu · Aviral Kumar · Yevgen Chebotar · Chelsea Finn · Sergey Levine · Karol Hausman -
2021 : CoMPS: Continual Meta Policy Search »
Glen Berseth · Zhiwei Zhang · Grace Zhang · Chelsea Finn · Sergey Levine -
2021 : Discriminator Augmented Model-Based Reinforcement Learning »
Allan Zhou · Archit Sharma · Chelsea Finn -
2021 : Example-Based Offline Reinforcement Learning without Rewards »
Kyle Hatch · Tianhe Yu · Rafael Rafailov · Chelsea Finn -
2021 : The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks »
Rafael Rafailov · · Tianhe Yu · Avi Singh · Mariano Phielipp · Chelsea Finn -
2022 : Contrastive Example-Based Control »
Kyle Hatch · Sarthak J Shetty · Benjamin Eysenbach · Tianhe Yu · Rafael Rafailov · Russ Salakhutdinov · Sergey Levine · Chelsea Finn -
2022 : Offline Policy Comparison with Confidence: Benchmarks and Baselines »
Anurag Koul · Mariano Phielipp · Alan Fern -
2022 : Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based RL »
David Brandfonbrener · Stephen Tu · Avi Singh · Stefan Welker · Chad Boodoo · Nikolai Matni · Jake Varley -
2022 : Group SELFIES: A Robust Fragment-Based Molecular String Representation »
Austin Cheng · Andy Cai · Santiago Miret · Gustavo Malkomes · Mariano Phielipp · Alan Aspuru-Guzik -
2022 : Conformer Search Using SE3-Transformers and Imitation Learning »
Luca Thiede · Santiago Miret · Krzysztof Sadowski · Haoping Xu · Mariano Phielipp · Alan Aspuru-Guzik -
2022 : Contrastive Example-Based Control »
Kyle Hatch · Sarthak J Shetty · Benjamin Eysenbach · Tianhe Yu · Rafael Rafailov · Russ Salakhutdinov · Sergey Levine · Chelsea Finn -
2021 : Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization »
Santiago Miret · Vui Seng Chua · Mattias Marder · Mariano Phielipp · Nilesh Jain · Somdeb Majumdar -
2021 Poster: Visual Adversarial Imitation Learning using Variational Models »
Rafael Rafailov · Tianhe Yu · Aravind Rajeswaran · Chelsea Finn -
2021 Poster: Efficiently Identifying Task Groupings for Multi-Task Learning »
Chris Fifty · Ehsan Amid · Zhe Zhao · Tianhe Yu · Rohan Anil · Chelsea Finn -
2021 Poster: COMBO: Conservative Offline Model-Based Policy Optimization »
Tianhe Yu · Aviral Kumar · Rafael Rafailov · Aravind Rajeswaran · Sergey Levine · Chelsea Finn -
2021 Poster: Conservative Data Sharing for Multi-Task Offline Reinforcement Learning »
Tianhe Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Sergey Levine · Chelsea Finn -
2020 : Mini-panel discussion 3 - Prioritizing Real World RL Challenges »
Chelsea Finn · Thomas Dietterich · Angela Schoellig · Anca Dragan · Anusha Nagabandi · Doina Precup -
2020 : Keynote: Chelsea Finn »
Chelsea Finn -
2020 Poster: Language-Conditioned Imitation Learning for Robot Manipulation Tasks »
Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor -
2020 Spotlight: Language-Conditioned Imitation Learning for Robot Manipulation Tasks »
Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor -
2020 Poster: Instance-based Generalization in Reinforcement Learning »
Martin Bertran · Natalia Martinez · Mariano Phielipp · Guillermo Sapiro -
2019 Poster: Goal-conditioned Imitation Learning »
Yiming Ding · Carlos Florensa · Pieter Abbeel · Mariano Phielipp -
2019 Poster: Language as an Abstraction for Hierarchical Deep Reinforcement Learning »
YiDing Jiang · Shixiang (Shane) Gu · Kevin Murphy · Chelsea Finn -
2017 Demonstration: Deep Robotic Learning using Visual Imagination and Meta-Learning »
Chelsea Finn · Frederik Ebert · Tianhe Yu · Annie Xie · Sudeep Dasari · Pieter Abbeel · Sergey Levine