Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, for which multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is one such model: it extends the standard POMDP by maximizing the reward while constraining the cost. To date, solution methods for CPOMDPs assume an explicit model of the environment and are therefore hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and requires only a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDPs and advances the state of the art by scaling to very large problems.
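For readers unfamiliar with the model, the constrained objective described in the abstract can be written roughly as follows. This is a sketch of the standard CPOMDP formulation, not a quotation from the paper: the symbols (policy \pi, discount factor \gamma, reward R, cost functions C_k, cost thresholds \hat{c}_k, number of constraints K) and the infinite-horizon discounted setting are notational assumptions introduced here for illustration.

```latex
% Assumed notation: \pi policy, \gamma discount factor, R reward,
% C_k the k-th cost function, \hat{c}_k its threshold, K the number of constraints.
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} C_k(s_t, a_t) \right] \le \hat{c}_k,
\qquad k = 1, \dots, K.
```

Under constraints of this kind an optimal policy may need to randomize between actions, which is consistent with the abstract's reference to optimal stochastic action selection.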
Author Information
Jongmin Lee (KAIST)
Geon-Hyeong Kim (KAIST)
Pascal Poupart (University of Waterloo & RBC Borealis AI)
Kee-Eung Kim (KAIST)
More from the Same Authors
-
2022 Poster: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 : Attribute Controlled Dialogue Prompting »
Runcheng Liu · Ahmad Rashid · Ivan Kobyzev · Mehdi Rezagholizadeh · Pascal Poupart -
2022 : Geometric attacks on batch normalization »
Amur Ghose · Apurv Gupta · Yaoliang Yu · Pascal Poupart -
2022 Spotlight: Optimality and Stability in Non-Convex Smooth Games »
Guojun Zhang · Pascal Poupart · Yaoliang Yu -
2022 Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II) »
Mehdi Rezagholizadeh · Peyman Passban · Yue Dong · Lili Mou · Pascal Poupart · Ali Ghodsi · Qun Liu -
2022 Poster: Uncertainty-Aware Reinforcement Learning for Risk-Sensitive Player Evaluation in Sports Game »
Guiliang Liu · Yudong Luo · Oliver Schulte · Pascal Poupart -
2021 : Best Papers and Closing Remarks »
Ali Ghodsi · Pascal Poupart -
2021 : Panel Discussion »
Pascal Poupart · Ali Ghodsi · Luke Zettlemoyer · Sameer Singh · Kevin Duh · Yejin Choi · Lu Hou -
2021 Workshop: Efficient Natural Language and Speech Processing (Models, Training, and Inference) »
Mehdi Rezagholizadeh · Lili Mou · Yue Dong · Pascal Poupart · Ali Ghodsi · Qun Liu -
2021 : Opening Speech »
Pascal Poupart -
2021 Poster: Multi-View Representation Learning via Total Correlation Objective »
HyeongJoo Hwang · Geon-Hyeong Kim · Seunghoon Hong · Kee-Eung Kim -
2021 Poster: Quantifying and Improving Transferability in Domain Generalization »
Guojun Zhang · Han Zhao · Yaoliang Yu · Pascal Poupart -
2021 Poster: Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning »
Guiliang Liu · Xiangyu Sun · Oliver Schulte · Pascal Poupart -
2020 Poster: Learning Agent Representations for Ice Hockey »
Guiliang Liu · Oliver Schulte · Pascal Poupart · Mike Rudd · Mehrsan Javan -
2020 Poster: Variational Interaction Information Maximization for Cross-domain Disentanglement »
HyeongJoo Hwang · Geon-Hyeong Kim · Seunghoon Hong · Kee-Eung Kim -
2020 Poster: Learning Dynamic Belief Graphs to Generalize on Text-Based Games »
Ashutosh Adhikari · Xingdi Yuan · Marc-Alexandre Côté · Mikuláš Zelinka · Marc-Antoine Rondeau · Romain Laroche · Pascal Poupart · Jian Tang · Adam Trischler · Will Hamilton -
2020 Poster: Reinforcement Learning for Control with Multiple Frequencies »
Jongmin Lee · Byung-Jun Lee · Kee-Eung Kim -
2018 Workshop: Reinforcement Learning under Partial Observability »
Joni Pajarinen · Chris Amato · Pascal Poupart · David Hsu -
2018 Poster: Deep Homogeneous Mixture Models: Representation, Separation, and Approximation »
Priyank Jaini · Pascal Poupart · Yaoliang Yu -
2018 Poster: Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks »
Agastya Kalra · Abdullah Rashwan · Wei-Shou Hsu · Pascal Poupart · Prashant Doshi · George Trimponias -
2018 Poster: Unsupervised Video Object Segmentation for Deep Reinforcement Learning »
Vikash Goel · Jameson Weng · Pascal Poupart -
2018 Poster: A Bayesian Approach to Generative Adversarial Imitation Learning »
Wonseok Jeon · Seokin Seo · Kee-Eung Kim -
2018 Spotlight: A Bayesian Approach to Generative Adversarial Imitation Learning »
Wonseok Jeon · Seokin Seo · Kee-Eung Kim -
2017 Poster: Generative Local Metric Learning for Kernel Regression »
Yung-Kyun Noh · Masashi Sugiyama · Kee-Eung Kim · Frank Park · Daniel Lee -
2016 Poster: Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics »
Wei-Shou Hsu · Pascal Poupart -
2016 Poster: A Unified Approach for Learning the Parameters of Sum-Product Networks »
Han Zhao · Pascal Poupart · Geoffrey Gordon -
2012 Poster: Cost-Sensitive Exploration in Bayesian Reinforcement Learning »
Dongho Kim · Kee-Eung Kim · Pascal Poupart -
2012 Poster: Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions »
Jaedeug Choi · Kee-Eung Kim -
2011 Poster: MAP Inference for Bayesian Inverse Reinforcement Learning »
Jaedeug Choi · Kee-Eung Kim