Imitation Learning (IL) has been successfully applied to complex sequential decision-making problems where standard Reinforcement Learning (RL) algorithms fail. A number of recent methods extend IL to few-shot learning scenarios, where a meta-trained policy learns to quickly master new tasks using limited demonstrations. However, although Inverse Reinforcement Learning (IRL) often outperforms Behavioral Cloning (BC) in terms of imitation quality, most of these approaches build on BC due to its simple optimization objective. In this work, we propose SMILe, a scalable framework for Meta Inverse Reinforcement Learning (Meta-IRL) based on maximum entropy IRL, which can learn high-quality policies from few demonstrations. We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC. Furthermore, we observe that SMILe performs comparably to or outperforms Meta-DAgger, while being applicable in the state-only setting and not requiring online experts. To our knowledge, our approach is the first efficient method for Meta-IRL that scales to the function approximator setting. For datasets and instructions on reproducing the results, please refer to https://github.com/KamyarGh/rlswiss/blob/master/reproducing/smilepaper.md .
Author Information
Kamyar Ghasemipour (University of Toronto, Vector Institute)
Shixiang (Shane) Gu (Google Brain)
Richard Zemel (Vector Institute/University of Toronto)
More from the Same Authors
-
2021 : Understanding Post-hoc Adaptation for Improving Subgroup Robustness »
David Madras · Richard Zemel -
2021 : Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data »
Sindy Löwe · David Madras · Richard Zemel · Max Welling -
2021 : Why so pessimistic? Estimating uncertainties for offline rl through ensembles, and why their independence matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2023 Poster: Distribution-Free Statistical Dispersion Control for Societal Applications »
Zhun Deng · Thomas Zollo · Jake Snell · Toniann Pitassi · Richard Zemel -
2022 Poster: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding »
Chitwan Saharia · William Chan · Saurabh Saxena · Lala Li · Jay Whang · Remi Denton · Kamyar Ghasemipour · Raphael Gontijo Lopes · Burcu Karagol Ayan · Tim Salimans · Jonathan Ho · David Fleet · Mohammad Norouzi -
2022 Poster: Implications of Model Indeterminacy for Explanations of Automated Decisions »
Marc-Etienne Brunet · Ashton Anderson · Richard Zemel -
2022 Poster: Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2022 Poster: Deep Ensembles Work, But Are They Necessary? »
Taiga Abe · Estefany Kelly Buchanan · Geoff Pleiss · Richard Zemel · John Cunningham -
2021 Workshop: Ecological Theory of Reinforcement Learning: How Does Task Design Influence Agent Learning? »
Manfred Díaz · Hiroki Furuta · Elise van der Pol · Lisa Lee · Shixiang (Shane) Gu · Pablo Samuel Castro · Simon Du · Marc Bellemare · Sergey Levine -
2021 Poster: Variational Model Inversion Attacks »
Kuan-Chieh Wang · YAN FU · Ke Li · Ashish Khisti · Richard Zemel · Alireza Makhzani -
2021 Poster: Identifying and Benchmarking Natural Out-of-Context Prediction Problems »
David Madras · Richard Zemel -
2020 : Contributed talks 5: Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification »
Elliot Creager · David Madras · Richard Zemel -
2020 Poster: Weakly-Supervised Reinforcement Learning for Controllable Behavior »
Lisa Lee · Benjamin Eysenbach · Russ Salakhutdinov · Shixiang (Shane) Gu · Chelsea Finn -
2019 : Poster Session »
Rishav Chourasia · Yichong Xu · Corinna Cortes · Chien-Yi Chang · Yoshihiro Nagano · So Yeon Min · Benedikt Boecking · Phi Vu Tran · Kamyar Ghasemipour · Qianggang Ding · Shouvik Mani · Vikram Voleti · Rasool Fakoor · Miao Xu · Kenneth Marino · Lisa Lee · Volker Tresp · Jean-Francois Kagy · Marvin Zhang · Barnabas Poczos · Dinesh Khandelwal · Adrien Bardes · Evan Shelhamer · Jiacheng Zhu · Ziming Li · Xiaoyan Li · Dmitrii Krasheninnikov · Ruohan Wang · Mayoore Jaiswal · Emad Barsoum · Suvansh Sanjeev · Theeraphol Wattanavekin · Qizhe Xie · Sifan Wu · Yuki Yoshida · David Kanaa · Sina Khoshfetrat Pakazad · Mehdi Maasoumy -
2019 Poster: Incremental Few-Shot Learning with Attention Attractor Networks »
Mengye Ren · Renjie Liao · Ethan Fetaya · Richard Zemel -
2019 Poster: Language as an Abstraction for Hierarchical Deep Reinforcement Learning »
YiDing Jiang · Shixiang (Shane) Gu · Kevin Murphy · Chelsea Finn -
2019 Poster: Efficient Graph Generation with Graph Recurrent Attention Networks »
Renjie Liao · Yujia Li · Yang Song · Shenlong Wang · Will Hamilton · David Duvenaud · Raquel Urtasun · Richard Zemel -
2018 Poster: Learning Latent Subspaces in Variational Autoencoders »
Jack Klys · Jake Snell · Richard Zemel -
2018 Poster: Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer »
David Madras · Toni Pitassi · Richard Zemel -
2018 Poster: Neural Guided Constraint Logic Programming for Program Synthesis »
Lisa Zhang · Gregory Rosenblatt · Ethan Fetaya · Renjie Liao · William Byrd · Matthew Might · Raquel Urtasun · Richard Zemel -
2017 : Contributed talk: Predict Responsibly: Increasing Fairness by Learning To Defer Abstract »
David Madras · Richard Zemel · Toni Pitassi -
2017 Poster: Dualing GANs »
Yujia Li · Alex Schwing · Kuan-Chieh Wang · Richard Zemel -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris Mooij · David Sontag · Richard Zemel · Max Welling -
2017 Spotlight: Dualing GANs »
Yujia Li · Alex Schwing · Kuan-Chieh Wang · Richard Zemel -
2017 Poster: Few-Shot Learning Through an Information Retrieval Lens »
Eleni Triantafillou · Richard Zemel · Raquel Urtasun -
2017 Poster: Prototypical Networks for Few-shot Learning »
Jake Snell · Kevin Swersky · Richard Zemel -
2016 Poster: Understanding the Effective Receptive Field in Deep Convolutional Neural Networks »
Wenjie Luo · Yujia Li · Raquel Urtasun · Richard Zemel -
2016 Poster: Learning Deep Parsimonious Representations »
Renjie Liao · Alex Schwing · Richard Zemel · Raquel Urtasun -
2015 Poster: Skip-Thought Vectors »
Jamie Kiros · Yukun Zhu · Russ Salakhutdinov · Richard Zemel · Raquel Urtasun · Antonio Torralba · Sanja Fidler -
2015 Poster: Exploring Models and Data for Image Question Answering »
Mengye Ren · Jamie Kiros · Richard Zemel -
2014 Workshop: Representation and Learning Methods for Complex Outputs »
Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto -
2014 Poster: A Multiplicative Model for Learning Distributed Text-Based Attribute Representations »
Jamie Kiros · Richard Zemel · Russ Salakhutdinov -
2013 Workshop: Output Representation Learning »
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser -
2013 Poster: A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data »
Jasper Snoek · Richard Zemel · Ryan Adams -
2013 Poster: On the Expressive Power of Restricted Boltzmann Machines »
James Martens · Arkadev Chattopadhya · Toni Pitassi · Richard Zemel -
2012 Poster: Collaborative Ranking With 17 Parameters »
Maksims Volkovs · Richard Zemel -
2012 Poster: Bayesian n-Choose-k Models for Classification and Ranking »
Kevin Swersky · Danny Tarlow · Richard Zemel · Ryan Adams · Brendan J Frey -
2012 Poster: Efficient Sampling for Bipartite Matching Problems »
Maksims Volkovs · Richard Zemel -
2012 Poster: Cardinality Restricted Boltzmann Machines »
Kevin Swersky · Danny Tarlow · Ilya Sutskever · Richard Zemel · Russ Salakhutdinov · Ryan Adams -
2010 Talk: Opening Remarks and Awards »
Richard Zemel · Terrence Sejnowski · John Shawe-Taylor -
2009 Placeholder: Opening Remarks »
Richard Zemel -
2008 Poster: Comparing model predictions of response bias and variance in cue combination »
Rama Natarajan · Iain Murray · Ladan Shams · Richard Zemel -
2008 Poster: Learning Hybrid Models for Image Annotation with Partially Labeled Data »
Xuming He · Richard Zemel -
2008 Poster: Competing RBM density models for classification of fMRI images »
Tanya Schmah · Geoffrey E Hinton · Richard Zemel