We study the problem of programmatic reinforcement learning, in which policies are represented as short programs in a symbolic language. Programmatic policies can be more interpretable, generalizable, and amenable to formal verification than neural policies; however, designing rigorous learning approaches for such policies remains a challenge. Our approach to this challenge, a meta-algorithm called PROPEL, is based on three insights. First, we view our learning task as optimization in policy space, modulo the constraint that the desired policy has a programmatic representation, and solve this optimization problem using a form of mirror descent that takes a gradient step into the unconstrained policy space and then projects back onto the constrained space. Second, we view the unconstrained policy space as mixing neural and programmatic representations, which enables employing state-of-the-art deep policy gradient approaches. Third, we cast the projection step as program synthesis via imitation learning, and exploit contemporary combinatorial methods for this task. We present theoretical convergence results for PROPEL and empirically evaluate the approach in three continuous control domains. The experiments show that PROPEL can significantly outperform state-of-the-art approaches for learning programmatic policies.
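The lift-and-project loop described in the abstract can be illustrated with a toy sketch. This is not the PROPEL implementation: it assumes a one-dimensional state, a known quadratic loss in place of a reinforcement learning objective, affine policies `a = w*s + b` as a stand-in for the programmatic class, and a least-squares fit as a stand-in for program synthesis via imitation learning. Because the geometry is Euclidean, the loop reduces to plain projected gradient descent rather than general mirror descent. All names (`fit_affine`, `lift_and_project`, `target`) are hypothetical.

```python
def fit_affine(states, actions):
    """Projection stand-in: least-squares fit of a = w*s + b to demonstrations."""
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    var_s = sum((s - mean_s) ** 2 for s in states)
    cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    w = cov_sa / var_s
    b = mean_a - w * mean_s
    return w, b


def lift_and_project(states, target, steps=50, eta=0.5):
    """Toy loop: gradient step in unconstrained space, then project back."""
    w, b = 0.0, 0.0  # initial affine ("programmatic") policy
    for _ in range(steps):
        # Actions of the current constrained policy on the sampled states.
        prog = [w * s + b for s in states]
        # Lift: gradient step on the loss 0.5 * (a - target(s))^2 taken
        # directly on the action values, so the result need not be affine.
        lifted = [p - eta * (p - target(s)) for s, p in zip(states, prog)]
        # Project: imitate the lifted policy with the best affine fit.
        w, b = fit_affine(states, lifted)
    return w, b


# Hypothetical target behaviour that no affine policy matches exactly.
def target(s):
    return 2.0 * s + 0.5 + 0.3 * s * s


states = [i / 10 - 1 for i in range(21)]  # evenly spaced states in [-1, 1]
w, b = lift_and_project(states, target)
print(f"learned affine policy: a = {w:.4f} * s + {b:.4f}")
```

In this toy setting the loop converges to the best affine approximation of `target` on the sampled states. In the setting the abstract describes, the lifted policy would mix neural and programmatic components and be updated with a deep policy gradient method, and the projection would be a program synthesis step.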
Author Information
Abhinav Verma (Rice University)
Hoang Le (California Institute of Technology)
Yisong Yue (Caltech)
Swarat Chaudhuri (Rice University)