Timezone: »
Poster
Adaptive Online Packing-guided Search for POMDPs
Chenyang Wu · Guoyu Yang · Zongzhang Zhang · Yang Yu · Dong Li · Wulong Liu · Jianye Hao
The partially observable Markov decision process (POMDP) provides a general framework for modeling an agent's decision process with state uncertainty, and online planning plays a pivotal role in solving it. A belief is a distribution of states representing state uncertainty. Methods for large-scale POMDP problems rely on the same idea of sampling both states and observations. That is, instead of exact belief updating, a collection of sampled states is used to approximate the belief; instead of considering all possible observations, only a set of sampled observations are considered. Inspired by this, we take one step further and propose an online planning algorithm, Adaptive Online Packing-guided Search (AdaOPS), to better approximate beliefs with adaptive particle filter technique and balance estimation bias and variance by fusing similar observation branches. Theoretically, our algorithm is guaranteed to find an $\epsilon$-optimal policy with a high probability given enough planning time under some mild assumptions. We evaluate our algorithm on several tricky POMDP domains, and it outperforms the state-of-the-art in all of them.
Author Information
Chenyang Wu (Nanjing University)
Guoyu Yang (Nanjing University)
Zongzhang Zhang (Nanjing University)
Yang Yu (Nanjing University)
Dong Li (Huawei Noah’s Ark Lab)
Wulong Liu (Huawei Noah's Ark Lab)
Jianye Hao (Tianjin University)
More from the Same Authors
-
2021 : OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning »
Jinyi Liu · Zhi Wang · YAN ZHENG · Jianye Hao · Junjie Ye · Chenjia Bai · Pengyi Li -
2021 : HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation »
Boyan Li · Hongyao Tang · YAN ZHENG · Jianye Hao · Pengyi Li · Zhaopeng Meng · LI Wang -
2021 : PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration »
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Sang Tong · YAN ZHENG · Jianye Hao · Matthew Taylor · Jinyi Liu -
2022 Poster: Versatile Multi-stage Graph Neural Network for Circuit Representation »
shuwen yang · Zhihao Yang · Dong Li · Yingxueff Zhang · Zhanguang Zhang · Guojie Song · Jianye Hao -
2022 Spotlight: Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning »
Chenyang Wu · Tianci Li · Zongzhang Zhang · Yang Yu -
2022 Spotlight: Lightning Talks 4B-1 »
Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong -
2022 Spotlight: Lightning Talks 3A-2 »
shuwen yang · Xu Zhang · Delvin Ce Zhang · Lan-Zhe Guo · Renzhe Xu · Zhuoer Xu · Yao-Xiang Ding · Weihan Li · Xingxuan Zhang · Xi-Zhu Wu · Zhenyuan Yuan · Hady Lauw · Yu Qi · Yi-Ge Zhang · Zhihao Yang · Guanghui Zhu · Dong Li · Changhua Meng · Kun Zhou · Gang Pan · Zhi-Fan Wu · Bo Li · Minghui Zhu · Zhi-Hua Zhou · Yafeng Zhang · Yingxueff Zhang · shiwen cui · Jie-Jing Shao · Zhanguang Zhang · Zhenzhe Ying · Xiaolong Chen · Yu-Feng Li · Guojie Song · Peng Cui · Weiqiang Wang · Ming GU · Jianye Hao · Yihua Huang -
2022 Spotlight: Versatile Multi-stage Graph Neural Network for Circuit Representation »
shuwen yang · Zhihao Yang · Dong Li · Yingxueff Zhang · Zhanguang Zhang · Guojie Song · Jianye Hao -
2022 Poster: Conformalized Fairness via Quantile Regression »
Meichen Liu · Lei Ding · Dengdeng Yu · Wulong Liu · Linglong Kong · Bei Jiang -
2022 Poster: Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning »
Chenyang Wu · Tianci Li · Zongzhang Zhang · Yang Yu -
2021 : HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation Q&A »
Boyan Li · Hongyao Tang · YAN ZHENG · Jianye Hao · Pengyi Li · Zhaopeng Meng · LI Wang -
2021 : HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation »
Boyan Li · Hongyao Tang · YAN ZHENG · Jianye Hao · Pengyi Li · Zhaopeng Meng · LI Wang -
2021 Poster: Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning »
Xiong-Hui Chen · Shengyi Jiang · Feng Xu · Zongzhang Zhang · Yang Yu -
2021 Poster: Model-Based Reinforcement Learning via Imagination with Derived Memory »
Yao Mu · Yuzheng Zhuang · Bin Wang · Guangxiang Zhu · Wulong Liu · Jianyu Chen · Ping Luo · Shengbo Li · Chongjie Zhang · Jianye Hao -
2021 Poster: Regret Minimization Experience Replay in Off-Policy Reinforcement Learning »
Xu-Hui Liu · Zhenghai Xue · Jingcheng Pang · Shengyi Jiang · Feng Xu · Yang Yu -
2021 Poster: A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems »
Yi Ma · Xiaotian Hao · Jianye Hao · Jiawen Lu · Xing Liu · Tong Xialiang · Mingxuan Yuan · Zhigang Li · Jie Tang · Zhaopeng Meng -
2021 Poster: Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning »
Danruo DENG · Guangyong Chen · Jianye Hao · Qiong Wang · Pheng-Ann Heng -
2021 Poster: An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning »
Tianpei Yang · Weixun Wang · Hongyao Tang · Jianye Hao · Zhaopeng Meng · Hangyu Mao · Dong Li · Wulong Liu · Yingfeng Chen · Yujing Hu · Changjie Fan · Chengwei Zhang -
2021 Poster: Offline Model-based Adaptable Policy Learning »
Xiong-Hui Chen · Yang Yu · Qingyang Li · Fan-Ming Luo · Zhiwei Qin · Wenjie Shang · Jieping Ye -
2021 Poster: S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks »
Xinlin Li · Bang Liu · Yaoliang Yu · Wulong Liu · Chunjing XU · Vahid Partovi Nia -
2021 Poster: Dynamic Bottleneck for Robust Self-Supervised Exploration »
Chenjia Bai · Lingxiao Wang · Lei Han · Animesh Garg · Jianye Hao · Peng Liu · Zhaoran Wang -
2018 : Coffee Break and Poster Session I »
Pim de Haan · Bin Wang · Dequan Wang · Aadil Hayat · Ibrahim Sobh · Muhammad Asif Rana · Thibault Buhet · Nicholas Rhinehart · Arjun Sharma · Alex Bewley · Michael Kelly · Lionel Blondé · Ozgur S. Oguz · Vaibhav Viswanathan · Jeroen Vanbaar · Konrad Żołna · Negar Rostamzadeh · Rowan McAllister · Sanjay Thakur · Alexandros Kalousis · Chelsea Sidrane · Sujoy Paul · Daphne Chen · Michal Garmulewicz · Henryk Michalewski · Coline Devin · Hongyu Ren · Jiaming Song · Wen Sun · Hanzhang Hu · Wulong Liu · Emilie Wirbel -
2018 Poster: Multi-Layered Gradient Boosting Decision Trees »
Ji Feng · Yang Yu · Zhi-Hua Zhou -
2018 Poster: A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents »
YAN ZHENG · Zhaopeng Meng · Jianye Hao · Zongzhang Zhang · Tianpei Yang · Changjie Fan -
2015 Poster: Subset Selection by Pareto Optimization »
Chao Qian · Yang Yu · Zhi-Hua Zhou