Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function. Popular approaches like CEM and CMA-ES greedily focus on promising regions of the search space and may get trapped in local maxima. DOO and VOOT balance exploration and exploitation, but use space partitioning strategies that are independent of the reward function being optimized. More recently, LaMCTS learns empirically to partition the search space in a reward-sensitive manner for black-box optimization. In this paper, we develop a novel formal regret analysis of when and why such an adaptive region partitioning scheme works. We also propose a new path planning method, LaP3, which improves the function value estimation within each sub-region and uses a latent representation of the search space. Empirically, LaP3 outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima, and shows benefits when plugged into the planning components of model-based RL methods such as PETS. These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 39% on average across 9 tasks, and in molecular design by up to 0.4 on properties measured on a 0-1 scale. Code is available at https://github.com/yangkevin2/neurips2021-lap3.
Author Information
Kevin Yang (UC Berkeley)
Tianjun Zhang (University of California, Berkeley)
Chris Cummins (Meta)
Chris Cummins is a Research Engineer at Facebook AI Research. His research focuses on fusing AI techniques with compilers and systems optimization. Before joining Facebook, Chris was a postdoc at the University of Edinburgh, where he received his Ph.D. and MSc degrees. He completed his MEng degree at Aston University. He is the recipient of numerous best paper awards, the SISCA Best Scottish PhD Award, and the Institute of Engineering and Technology Prize.
Brandon Cui (Facebook AI Research)
Benoit Steiner (Facebook AI Research)
Linnan Wang (Brown University)
Joseph Gonzalez (UC Berkeley)
Dan Klein (UC Berkeley)
Yuandong Tian (Facebook AI Research)
More from the Same Authors
-
2020 : Iterative Value Learning for Throughput Optimization of Deep Learning Workloads »
Benoit Steiner -
2021 : TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers »
Lianmin Zheng · Ruochen Liu · Junru Shao · Tianqi Chen · Joseph Gonzalez · Ion Stoica · Ameer Haj-Ali -
2021 : Effect of Model Size on Worst-group Generalization »
Alan Pham · Eunice Chan · Vikranth Srivatsa · Dhruba Ghosh · Yaoqing Yang · Yaodong Yu · Ruiqi Zhong · Joseph Gonzalez · Jacob Steinhardt -
2021 : C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks »
Tianjun Zhang · Ben Eysenbach · Russ Salakhutdinov · Sergey Levine · Joseph Gonzalez -
2021 : Graph Backup: Data Efficient Backup Exploiting Markovian Data »
Zhengyao Jiang · Tianjun Zhang · Robert Kirk · Tim Rocktäschel · Edward Grefenstette -
2022 : Efficient Planning in a Compact Latent Action Space »
Zhengyao Jiang · Tianjun Zhang · Michael Janner · Yueying (Lisa) Li · Tim Rocktäschel · Edward Grefenstette · Yuandong Tian -
2022 : Panel RL Implementation »
Xiaolin Ge · Alborz Geramifard · Kence Anderson · Craig Buhr · Robert Nishihara · Yuandong Tian -
2022 Workshop: Machine Learning for Systems »
Neel Kant · Martin Maas · Azade Nova · Benoit Steiner · Xinlei Xu · Dan Zhang -
2022 Poster: Off-Team Learning »
Brandon Cui · Hengyuan Hu · Andrei Lupu · Samuel Sokota · Jakob Foerster -
2022 Poster: Self-Explaining Deviations for Coordination »
Hengyuan Hu · Samuel Sokota · David Wu · Anton Bakhtin · Andrei Lupu · Brandon Cui · Jakob Foerster -
2022 Poster: Contrastive Learning as Goal-Conditioned Reinforcement Learning »
Benjamin Eysenbach · Tianjun Zhang · Sergey Levine · Russ Salakhutdinov -
2021 : Closing Remarks »
Jonathan Raiman · Mimee Xu · Martin Maas · Anna Goldie · Azade Nova · Benoit Steiner -
2021 : ML-guided iterative refinement for system optimization »
Yuandong Tian -
2021 : Community Infrastructure for Applying Reinforcement Learning to Compiler Optimizations »
Chris Cummins · Bram Wasti · Brandon Cui · Olivier Teytaud · Benoit Steiner · Yuandong Tian · Hugh Leather -
2021 : Language, Context, and Action: A Semantic Machines View of Conversational AI »
Dan Klein -
2021 : Opening Remarks »
Jonathan Raiman · Anna Goldie · Benoit Steiner · Azade Nova · Martin Maas · Mimee Xu -
2021 Workshop: ML For Systems »
Benoit Steiner · Jonathan Raiman · Martin Maas · Azade Nova · Mimee Xu · Anna Goldie -
2021 Poster: Accelerating Quadratic Optimization with Reinforcement Learning »
Jeffrey Ichnowski · Paras Jain · Bartolomeo Stellato · Goran Banjac · Michael Luo · Francesco Borrelli · Joseph Gonzalez · Ion Stoica · Ken Goldberg -
2021 Poster: Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL »
Charles Packer · Pieter Abbeel · Joseph Gonzalez -
2021 Poster: RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem »
Eric Liang · Zhanghao Wu · Michael Luo · Sven Mika · Joseph Gonzalez · Ion Stoica -
2021 Poster: Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages »
Xinyun Chen · Dawn Song · Yuandong Tian -
2021 : Machine Learning for Combinatorial Optimization + Q&A »
Maxime Gasse · Simon Bowly · Chris Cameron · Quentin Cappart · Jonas Charfreitag · Laurent Charlin · Shipra Agrawal · Didier Chetelat · Justin Dumouchelle · Ambros Gleixner · Aleksandr Kazachkov · Elias Khalil · Pawel Lichocki · Andrea Lodi · Miles Lubin · Christopher Morris · Dimitri Papageorgiou · Augustin Parjadis · Sebastian Pokutta · Antoine Prouvost · Yuandong Tian · Lara Scavuzzo · Giulia Zarpellon -
2021 Poster: Representing Long-Range Context for Graph Neural Networks with Global Attention »
Zhanghao Wu · Paras Jain · Matthew Wright · Azalia Mirhoseini · Joseph Gonzalez · Ion Stoica -
2021 Poster: NovelD: A Simple yet Effective Exploration Criterion »
Tianjun Zhang · Huazhe Xu · Xiaolong Wang · Yi Wu · Kurt Keutzer · Joseph Gonzalez · Yuandong Tian -
2021 Poster: MADE: Exploration via Maximizing Deviation from Explored Regions »
Tianjun Zhang · Paria Rashidinejad · Jiantao Jiao · Yuandong Tian · Joseph Gonzalez · Stuart Russell -
2021 Poster: K-level Reasoning for Zero-Shot Coordination in Hanabi »
Brandon Cui · Hengyuan Hu · Luis Pineda · Jakob Foerster -
2021 Poster: Taxonomizing local versus global structure in neural network loss landscapes »
Yaoqing Yang · Liam Hodgkinson · Ryan Theisen · Joe Zou · Joseph Gonzalez · Kannan Ramchandran · Michael Mahoney -
2020 : QA: Yuandong Tian »
Yuandong Tian -
2020 : Contributed Talk: Yuandong Tian »
Yuandong Tian -
2020 : Invited Talk (Yuandong Tian) »
Yuandong Tian -
2020 : Invited Speaker: Benoit Steiner »
Benoit Steiner -
2020 : Program Graphs for Machine Learning »
Chris Cummins -
2020 Poster: Boundary thickness and robustness in learning models »
Yaoqing Yang · Rajiv Khanna · Yaodong Yu · Amir Gholami · Kurt Keutzer · Joseph Gonzalez · Kannan Ramchandran · Michael Mahoney -
2020 Poster: A Statistical Framework for Low-bitwidth Training of Deep Neural Networks »
Jianfei Chen · Yu Gai · Zhewei Yao · Michael Mahoney · Joseph Gonzalez -
2020 Poster: Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search »
Linnan Wang · Rodrigo Fonseca · Yuandong Tian -
2020 Poster: Joint Policy Search for Multi-agent Collaboration with Imperfect Information »
Yuandong Tian · Qucheng Gong · Yu Jiang -
2019 Workshop: MLSys: Workshop on Systems for ML »
Aparna Lakshmiratan · Siddhartha Sen · Joseph Gonzalez · Dan Crankshaw · Sarah Bird -
2019 Poster: Coda: An End-to-End Neural Program Decompiler »
Cheng Fu · Huili Chen · Haolan Liu · Xinyun Chen · Yuandong Tian · Farinaz Koushanfar · Jishen Zhao -
2019 Poster: PyTorch: An Imperative Style, High-Performance Deep Learning Library »
Adam Paszke · Sam Gross · Francisco Massa · Adam Lerer · James Bradbury · Gregory Chanan · Trevor Killeen · Zeming Lin · Natalia Gimelshein · Luca Antiga · Alban Desmaison · Andreas Kopf · Edward Yang · Zachary DeVito · Martin Raison · Alykhan Tejani · Sasank Chilamkurthy · Benoit Steiner · Lu Fang · Junjie Bai · Soumith Chintala -
2019 Poster: ANODEV2: A Coupled Neural ODE Framework »
Tianjun Zhang · Zhewei Yao · Amir Gholami · Joseph Gonzalez · Kurt Keutzer · Michael Mahoney · George Biros -
2019 Poster: Hierarchical Decision Making by Generating and Following Natural Language Instructions »
Hengyuan Hu · Denis Yarats · Qucheng Gong · Yuandong Tian · Mike Lewis -
2019 Poster: One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers »
Ari Morcos · Haonan Yu · Michela Paganini · Yuandong Tian -
2019 Poster: Learning to Perform Local Rewriting for Combinatorial Optimization »
Xinyun Chen · Yuandong Tian -
2018 Poster: Speaker-Follower Models for Vision-and-Language Navigation »
Daniel Fried · Ronghang Hu · Volkan Cirik · Anna Rohrbach · Jacob Andreas · Louis-Philippe Morency · Taylor Berg-Kirkpatrick · Kate Saenko · Dan Klein · Trevor Darrell -
2017 Poster: ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games »
Yuandong Tian · Qucheng Gong · Wendy Shang · Yuxin Wu · Larry Zitnick -
2017 Oral: ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games »
Yuandong Tian · Qucheng Gong · Wendy Shang · Yuxin Wu · Larry Zitnick -
2015 Poster: On the Accuracy of Self-Normalized Log-Linear Models »
Jacob Andreas · Maxim Rabinovich · Michael Jordan · Dan Klein -
2014 Poster: Unsupervised Transcription of Piano Music »
Taylor Berg-Kirkpatrick · Jacob Andreas · Dan Klein -
2014 Demonstration: Unsupervised Transcription of Piano Music »
Taylor Berg-Kirkpatrick · Jacob Andreas · Dan Klein -
2014 Spotlight: Unsupervised Transcription of Piano Music »
Taylor Berg-Kirkpatrick · Jacob Andreas · Dan Klein -
2009 Poster: Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs »
Alexandre Bouchard-Côté · Slav Petrov · Dan Klein -
2009 Spotlight: Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs »
Alexandre Bouchard-Côté · Slav Petrov · Dan Klein -
2008 Workshop: Speech and Language: Unsupervised Latent-Variable Models »
Slav Petrov · Aria Haghighi · Percy Liang · Dan Klein -
2008 Poster: Efficient Inference in Phylogenetic InDel Trees »
Alexandre Bouchard-Côté · Michael Jordan · Dan Klein -
2008 Spotlight: Efficient Inference in Phylogenetic InDel Trees »
Alexandre Bouchard-Côté · Michael Jordan · Dan Klein -
2007 Poster: Agreement-Based Learning »
Percy Liang · Dan Klein · Michael Jordan -
2007 Spotlight: Agreement-Based Learning »
Percy Liang · Dan Klein · Michael Jordan -
2007 Session: Spotlights »
Dan Klein -
2007 Session: Spotlights »
Dan Klein -
2007 Spotlight: Discriminative Log-Linear Grammars with Latent Variables »
Slav Petrov · Dan Klein -
2007 Poster: Discriminative Log-Linear Grammars with Latent Variables »
Slav Petrov · Dan Klein -
2007 Poster: A Probabilistic Approach to Language Change »
Alexandre Bouchard-Côté · Percy Liang · Tom Griffiths · Dan Klein