Timezone: »
We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning. This provides one way of leveraging and combining the advantages of model-free and model-based approaches. Specifically, we differentiate through MPC by using the KKT conditions of the convex approximation at a fixed point of the controller. Using this strategy, we are able to learn the cost and dynamics of a controller via end-to-end learning. Our experiments focus on imitation learning in the pendulum and cartpole domains, where we learn the cost and dynamics terms of an MPC policy class. We show that our MPC policies are significantly more data-efficient than a generic neural network and that our method is superior to traditional system identification in a setting where the expert is unrealizable.
Author Information
Brandon Amos (Carnegie Mellon University)
Ivan Jimenez (Georgia Tech)
Jacob I Sacks (Georgia Institute of Technology)
I am a PhD student in the Electrical and Computer Engineering department at Georgia Tech working in the Robot Learning Lab under Dr. Byron Boots. My primary research interests are in machine learning, optimal control theory, imitation/reinforcement learning, and connections between these disciplines. Additionally, I am interested in hardware acceleration for machine learning and robotics and neuro-inspired compute systems. Prior to joining Georgia Tech, I received my BS in Biomedical Engineering from UT Austin.
Byron Boots (Georgia Tech / Google Brain)
J. Zico Kolter (Carnegie Mellon University / Bosch Center for AI)
Zico Kolter is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, and also serves as Chief Scientist of AI Research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a large focus on developing more robust, explainable, and rigorous methods in deep learning. In addition, he has worked on a number of application areas, highlighted by work on sustainability and smart energy systems. He is the recipient of the DARPA Young Faculty Award, and best paper awards at KDD, IJCAI, and PESGM.
More from the Same Authors
-
2020 : An adversarially robust approach to security-constrained optimal power flow »
Neeraj Vijay Bedmutha · Priya Donti · J. Zico Kolter -
2021 : Cross-Domain Imitation Learning via Optimal Transport »
Arnaud Fickinger · Samuel Cohen · Stuart Russell · Brandon Amos -
2021 : Imitation Learning from Pixel Observations for Continuous Control »
Samuel Cohen · Brandon Amos · Marc Deisenroth · Mikael Henaff · Eugene Vinitsky · Denis Yarats -
2021 : Input Convex Gradient Networks »
Jack Richter-Powell · Jonathan Lorraine · Brandon Amos -
2021 : Input Convex Gradient Networks »
Jack Richter-Powell · Jonathan Lorraine · Brandon Amos -
2021 : Sliced Multi-Marginal Optimal Transport »
Samuel Cohen · Alexander Terenin · Yannik Pitcan · Brandon Amos · Marc Deisenroth · Senanayak Sesh Kumar Karri -
2022 : Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation »
Melrose Roderick · Felix Berkenkamp · Fatemeh Sheikholeslami · J. Zico Kolter -
2022 : Denoised Smoothing with Sample Rejection for Robustifying Pretrained Classifiers »
Fatemeh Sheikholeslami · Wan-Yi Lin · Jan Hendrik Metzen · Huan Zhang · J. Zico Kolter -
2022 : A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games »
Samuel Sokota · Ryan D'Orazio · J. Zico Kolter · Nicolas Loizou · Marc Lanctot · Ioannis Mitliagkas · Noam Brown · Christian Kroer -
2022 : Uncertainty-Driven Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2022 : Improving Adversarial Robustness via Joint Classification and Multiple Explicit Detection Classes »
Sina Baharlouei · Fatemeh Sheikholeslami · Meisam Razaviyayn · J. Zico Kolter -
2022 Workshop: Trustworthy and Socially Responsible Machine Learning »
Huan Zhang · Linyi Li · Chaowei Xiao · J. Zico Kolter · Anima Anandkumar · Bo Li -
2022 : Zico Kolter, Adapt like you train: How optimization at training time affects model finetuning and adaptation »
J. Zico Kolter -
2022 Poster: Characterizing Datapoints via Second-Split Forgetting »
Pratyush Maini · Saurabh Garg · Zachary Lipton · J. Zico Kolter -
2022 Poster: Learning Options via Compression »
Yiding Jiang · Evan Liu · Benjamin Eysenbach · J. Zico Kolter · Chelsea Finn -
2022 Poster: Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation »
Zhouxing Shi · Yihan Wang · Huan Zhang · J. Zico Kolter · Cho-Jui Hsieh -
2022 Poster: Test Time Adaptation via Conjugate Pseudo-labels »
Sachin Goyal · Mingjie Sun · Aditi Raghunathan · J. Zico Kolter -
2022 Poster: Deep Equilibrium Approaches to Diffusion Models »
Ashwini Pokle · Zhengyang Geng · J. Zico Kolter -
2022 Poster: Agreement-on-the-line: Predicting the Performance of Neural Networks under Distribution Shift »
Christina Baek · Yiding Jiang · Aditi Raghunathan · J. Zico Kolter -
2022 Poster: General Cutting Planes for Bound-Propagation-Based Neural Network Verification »
Huan Zhang · Shiqi Wang · Kaidi Xu · Linyi Li · Bo Li · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation »
Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse -
2022 Poster: The Pitfalls of Regularization in Off-Policy TD Learning »
Gaurav Manek · J. Zico Kolter -
2021 : Panel B: Safe Learning and Decision Making in Uncertain and Unstructured Environments »
Yisong Yue · J. Zico Kolter · Ivan Dario D Jimenez Rodriguez · Dragos Margineantu · Animesh Garg · Melissa Greeff -
2021 : Enforcing Robustness for Neural Network Policies »
J. Zico Kolter -
2021 Poster: Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Robustness Verification »
Shiqi Wang · Huan Zhang · Kaidi Xu · Xue Lin · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2021 Poster: Joint inference and input optimization in equilibrium networks »
Swaminathan Gurumurthy · Shaojie Bai · Zachary Manchester · J. Zico Kolter -
2021 Poster: Scalable Online Planning via Reinforcement Learning Fine-Tuning »
Arnaud Fickinger · Hengyuan Hu · Brandon Amos · Stuart Russell · Noam Brown -
2021 Poster: $(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations »
Zhichun Huang · Shaojie Bai · J. Zico Kolter -
2021 Poster: Boosted CVaR Classification »
Runtian Zhai · Chen Dan · Arun Suggala · J. Zico Kolter · Pradeep Ravikumar -
2021 Poster: Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds »
Yujia Huang · Huan Zhang · Yuanyuan Shi · J. Zico Kolter · Anima Anandkumar -
2021 Poster: Adversarially robust learning for security-constrained optimal power flow »
Priya Donti · Aayushya Agarwal · Neeraj Vijay Bedmutha · Larry Pileggi · J. Zico Kolter -
2021 Poster: Robustness between the worst and average case »
Leslie Rice · Anna Bair · Huan Zhang · J. Zico Kolter -
2021 Poster: Monte Carlo Tree Search With Iteratively Refining State Abstractions »
Samuel Sokota · Caleb Y Ho · Zaheen Ahmad · J. Zico Kolter -
2020 : Invited Talk (Zico Kolter) »
J. Zico Kolter -
2020 Workshop: Machine Learning for Engineering Modeling, Simulation and Design »
Alex Beatson · Priya Donti · Amira Abdel-Rahman · Stephan Hoyer · Rose Yu · J. Zico Kolter · Ryan Adams -
2020 : Deep Riemannian Manifold Learning »
Aaron Lou · Maximilian Nickel · Brandon Amos -
2020 : Keynote by Zico Kolter »
J. Zico Kolter -
2020 Poster: Community detection using fast low-cardinality semidefinite programming
 »
Po-Wei Wang · J. Zico Kolter -
2020 Poster: Deep Archimedean Copulas »
Chun Kai Ling · Fei Fang · J. Zico Kolter -
2020 Tutorial: (Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization Q&A »
David Duvenaud · J. Zico Kolter · Matthew Johnson -
2020 Poster: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter -
2020 Spotlight: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter -
2020 Poster: Multiscale Deep Equilibrium Models »
Shaojie Bai · Vladlen Koltun · J. Zico Kolter -
2020 Poster: Denoised Smoothing: A Provable Defense for Pretrained Classifiers »
Hadi Salman · Mingjie Sun · Greg Yang · Ashish Kapoor · J. Zico Kolter -
2020 Poster: Monotone operator equilibrium networks »
Ezra Winston · J. Zico Kolter -
2020 Spotlight: Monotone operator equilibrium networks »
Ezra Winston · J. Zico Kolter -
2020 Oral: Multiscale Deep Equilibrium Models »
Shaojie Bai · Vladlen Koltun · J. Zico Kolter -
2020 Tutorial: (Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization »
David Duvenaud · J. Zico Kolter · Matthew Johnson -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 Poster: Learning Stable Deep Dynamics Models »
J. Zico Kolter · Gaurav Manek -
2019 Poster: Adversarial Music: Real world Audio Adversary against Wake-word Detection System »
Juncheng Li · Shuhui Qu · Xinjian Li · Joseph Szurley · J. Zico Kolter · Florian Metze -
2019 Spotlight: Adversarial Music: Real world Audio Adversary against Wake-word Detection System »
Juncheng Li · Shuhui Qu · Xinjian Li · Joseph Szurley · J. Zico Kolter · Florian Metze -
2019 Poster: Differentiable Convex Optimization Layers »
Akshay Agrawal · Brandon Amos · Shane Barratt · Stephen Boyd · Steven Diamond · J. Zico Kolter -
2019 Poster: Uniform convergence may be unable to explain generalization in deep learning »
Vaishnavh Nagarajan · J. Zico Kolter -
2019 Poster: Deep Equilibrium Models »
Shaojie Bai · J. Zico Kolter · Vladlen Koltun -
2019 Spotlight: Deep Equilibrium Models »
Shaojie Bai · J. Zico Kolter · Vladlen Koltun -
2019 Oral: Uniform convergence may be unable to explain generalization in deep learning »
Vaishnavh Nagarajan · J. Zico Kolter -
2018 : Byron Boots »
Byron Boots -
2018 : Talk 1: Zico Kolter - Differentiable Physics and Control »
J. Zico Kolter -
2018 Poster: Learning and Inference in Hilbert Space with Quantum Graphical Models »
Siddarth Srinivasan · Carlton Downey · Byron Boots -
2018 Poster: Dual Policy Iteration »
Wen Sun · Geoffrey Gordon · Byron Boots · J. Bagnell -
2018 Poster: Depth-Limited Solving for Imperfect-Information Games »
Noam Brown · Tuomas Sandholm · Brandon Amos -
2018 Poster: End-to-End Differentiable Physics for Learning and Control »
Filipe de Avila Belbute Peres · Kevin Smith · Kelsey Allen · Josh Tenenbaum · J. Zico Kolter -
2018 Poster: Orthogonally Decoupled Variational Gaussian Processes »
Hugh Salimbeni · Ching-An Cheng · Byron Boots · Marc Deisenroth -
2018 Spotlight: End-to-End Differentiable Physics for Learning and Control »
Filipe de Avila Belbute Peres · Kevin Smith · Kelsey Allen · Josh Tenenbaum · J. Zico Kolter -
2018 Poster: Scaling provable adversarial defenses »
Eric Wong · Frank Schmidt · Jan Hendrik Metzen · J. Zico Kolter -
2018 Tutorial: Adversarial Robustness: Theory and Practice »
J. Zico Kolter · Aleksander Madry -
2017 : Provable defenses against adversarial examples via the convex outer adversarial polytope »
J. Zico Kolter -
2017 Poster: Gradient descent GAN optimization is locally stable »
Vaishnavh Nagarajan · J. Zico Kolter -
2017 Oral: Gradient descent GAN optimization is locally stable »
Vaishnavh Nagarajan · J. Zico Kolter -
2017 Poster: Predictive-State Decoders: Encoding the Future into Recurrent Networks »
Arun Venkatraman · Nicholas Rhinehart · Wen Sun · Lerrel Pinto · Martial Hebert · Byron Boots · Kris Kitani · J. Bagnell -
2017 Poster: Variational Inference for Gaussian Process Models with Linear Complexity »
Ching-An Cheng · Byron Boots -
2017 Poster: Task-based End-to-end Model Learning in Stochastic Optimization »
Priya Donti · J. Zico Kolter · Brandon Amos -
2017 Poster: Predictive State Recurrent Neural Networks »
Carlton Downey · Ahmed Hefny · Byron Boots · Geoffrey Gordon · Boyue Li -
2016 Poster: The Multiple Quantile Graphical Model »
Alnur Ali · J. Zico Kolter · Ryan Tibshirani -
2016 Poster: Incremental Variational Sparse Gaussian Process Regression »
Ching-An Cheng · Byron Boots -
2013 Workshop: Machine Learning for Sustainability »
Edwin Bonilla · Thomas Dietterich · Theodoros Damoulas · Andreas Krause · Daniel Sheldon · Iadine Chades · J. Zico Kolter · Bistra Dilkina · Carla Gomes · Hugo P Simao -
2011 Workshop: Machine Learning for Sustainability »
Thomas Dietterich · J. Zico Kolter · Matthew A Brown -
2011 Poster: The Fixed Points of Off-Policy TD »
J. Zico Kolter -
2011 Spotlight: The Fixed Points of Off-Policy TD »
J. Zico Kolter -
2010 Poster: Energy Disaggregation via Discriminative Sparse Coding »
J. Zico Kolter · Siddarth Batra · Andrew Y Ng -
2009 Mini Symposium: Machine Learning for Sustainability »
J. Zico Kolter · Thomas Dietterich · Andrew Y Ng -
2007 Spotlight: Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion »
J. Zico Kolter · Pieter Abbeel · Andrew Y Ng -
2007 Poster: Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion »
J. Zico Kolter · Pieter Abbeel · Andrew Y Ng