Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
Author Information
Jonathan Ho (Stanford)
Stefano Ermon (Stanford)
More from the Same Authors
- 2020 Poster: Improved Techniques for Training Score-Based Generative Models
  Yang Song · Stefano Ermon
- 2020 Poster: Probabilistic Circuits for Variational Inference in Discrete Graphical Models
  Andy Shih · Stefano Ermon
- 2020 Poster: Efficient Learning of Generative Models via Finite-Difference Score Matching
  Tianyu Pang · Kun Xu · Chongxuan LI · Yang Song · Stefano Ermon · Jun Zhu
- 2020 Poster: Belief Propagation Neural Networks
  Jonathan Kuck · Shuvam Chakraborty · Hao Tang · Rachel Luo · Jiaming Song · Ashish Sabharwal · Stefano Ermon
- 2020 Poster: HiPPO: Recurrent Memory with Optimal Polynomial Projections
  Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré
- 2020 Spotlight: HiPPO: Recurrent Memory with Optimal Polynomial Projections
  Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré
- 2020 Poster: Autoregressive Score Matching
  Chenlin Meng · Lantao Yu · Yang Song · Jiaming Song · Stefano Ermon
- 2020 Poster: Diversity can be Transferred: Output Diversification for White- and Black-box Attacks
  Yusuke Tashiro · Yang Song · Stefano Ermon
- 2020 Poster: MOPO: Model-based Offline Policy Optimization
  Tianhe Yu · Garrett Thomas · Lantao Yu · Stefano Ermon · James Zou · Sergey Levine · Chelsea Finn · Tengyu Ma
- 2020 Poster: Multi-label Contrastive Predictive Coding
  Jiaming Song · Stefano Ermon
- 2020 Oral: Multi-label Contrastive Predictive Coding
  Jiaming Song · Stefano Ermon
- 2019 Workshop: Information Theory and Machine Learning
  Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alexandros Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon
- 2019 Poster: Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.
  Sawyer Birnbaum · Volodymyr Kuleshov · Zayd Enam · Pang Wei Koh · Stefano Ermon
- 2019 Poster: MintNet: Building Invertible Neural Networks with Masked Convolutions
  Yang Song · Chenlin Meng · Stefano Ermon
- 2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
  Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon
- 2019 Poster: Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
  Lantao Yu · Tianhe Yu · Chelsea Finn · Stefano Ermon
- 2019 Poster: Approximating the Permanent by Sampling from Adaptive Partitions
  Jonathan Kuck · Tri Dao · Hamid Rezatofighi · Ashish Sabharwal · Stefano Ermon
- 2019 Poster: Generative Modeling by Estimating Gradients of the Data Distribution
  Yang Song · Stefano Ermon
- 2019 Oral: Generative Modeling by Estimating Gradients of the Data Distribution
  Yang Song · Stefano Ermon
- 2018 Workshop: Relational Representation Learning
  Aditya Grover · Paroma Varma · Frederic Sala · Christopher Ré · Jennifer Neville · Stefano Ermon · Steven Holtzen
- 2018 Poster: Streamlining Variational Inference for Constraint Satisfaction Problems
  Aditya Grover · Tudor Achim · Stefano Ermon
- 2018 Poster: Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
  Neal Jean · Sang Michael Xie · Stefano Ermon
- 2018 Poster: Multi-Agent Generative Adversarial Imitation Learning
  Jiaming Song · Hongyu Ren · Dorsa Sadigh · Stefano Ermon
- 2018 Poster: Constructing Unrestricted Adversarial Examples with Generative Models
  Yang Song · Rui Shu · Nate Kushman · Stefano Ermon
- 2018 Poster: Bias and Generalization in Deep Generative Models: An Empirical Study
  Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon
- 2018 Spotlight: Bias and Generalization in Deep Generative Models: An Empirical Study
  Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon
- 2018 Poster: Amortized Inference Regularization
  Rui Shu · Hung Bui · Shengjia Zhao · Mykel J Kochenderfer · Stefano Ermon
- 2017 Poster: A-NICE-MC: Adversarial Training for MCMC
  Jiaming Song · Shengjia Zhao · Stefano Ermon
- 2017 Poster: InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
  Yunzhu Li · Jiaming Song · Stefano Ermon
- 2017 Poster: Neural Variational Inference and Learning in Undirected Graphical Models
  Volodymyr Kuleshov · Stefano Ermon
- 2016 Poster: Solving Marginal MAP Problems with NP Oracles and Parity Constraints
  Yexiang Xue · zhiyuan li · Stefano Ermon · Carla Gomes · Bart Selman
- 2016 Poster: Variational Bayes on Monte Carlo Steroids
  Aditya Grover · Stefano Ermon
- 2016 Poster: Adaptive Concentration Inequalities for Sequential Decision Problems
  Shengjia Zhao · Enze Zhou · Ashish Sabharwal · Stefano Ermon
- 2013 Poster: Embed and Project: Discrete Sampling with Universal Hashing
  Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman
- 2012 Poster: Density Propagation and Improved Bounds on the Partition Function
  Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman
- 2011 Poster: Accelerated Adaptive Markov Chain for Partition Function Computation
  Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman
- 2011 Spotlight: Accelerated Adaptive Markov Chain for Partition Function Computation
  Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman