Poster
Robust $\phi$-Divergence MDPs
Chin Pang Ho · Marek Petrik · Wolfram Wiesemann
In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view of the most adverse transition kernel from a prescribed ambiguity set. In this paper, we develop a novel solution framework for robust MDPs with $s$-rectangular ambiguity sets that decomposes the problem into a sequence of robust Bellman updates and simplex projections. Exploiting the rich structure present in the simplex projections corresponding to $\phi$-divergence ambiguity sets, we show that the associated $s$-rectangular robust MDPs can be solved substantially faster than with state-of-the-art commercial solvers as well as a recent first-order solution scheme, thus rendering them attractive alternatives to classical MDPs in practical applications.
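To make "optimizing in view of the most adverse transition kernel" concrete, the sketch below evaluates a worst-case expected value over a Kullback-Leibler ball (one example of a $\phi$-divergence) around a nominal distribution, using the well-known one-dimensional dual of that problem. It is not the algorithm developed in the paper: it assumes a simpler $(s,a)$-rectangular ambiguity set rather than an $s$-rectangular one, and the names `worst_case_value_kl`, `robust_bellman_update`, `p_bar`, and `kappa` are hypothetical and chosen only for illustration.

```python
import math


def worst_case_value_kl(z, p_bar, kappa, iters=200):
    """Approximate min_p p.z subject to KL(p || p_bar) <= kappa, p in the simplex.

    Uses the scalar dual  max_{beta > 0}  -beta * log(sum_i p_bar_i * exp(-z_i / beta)) - beta * kappa,
    which is concave in beta, and maximizes it by ternary search.
    Assumption: p_bar is a probability vector; entries with zero mass are ignored.
    """
    support = [(zi, pi) for zi, pi in zip(z, p_bar) if pi > 0.0]
    z_min = min(zi for zi, _ in support)

    def dual(beta):
        # Shift by z_min so every exponent is <= 0 (avoids overflow).
        s = sum(pi * math.exp(-(zi - z_min) / beta) for zi, pi in support)
        return z_min - beta * math.log(s) - beta * kappa

    lo, hi = 1e-8, 1e6  # assumed search bracket for beta
    best = dual(lo)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        d1, d2 = dual(m1), dual(m2)
        # Every dual value is a valid lower bound, so keep the largest one seen.
        best = max(best, d1, d2)
        if d1 < d2:
            lo = m1
        else:
            hi = m2
    return best


def robust_bellman_update(rewards, p_bar, v, gamma, kappa):
    """One (s,a)-rectangular robust Bellman update for a single state.

    rewards[a][s2] and p_bar[a][s2] are indexed by action a and next state s2.
    """
    return max(
        worst_case_value_kl(
            [r + gamma * vn for r, vn in zip(rewards[a], v)], p_bar[a], kappa
        )
        for a in range(len(p_bar))
    )
```

With `kappa = 0` the ambiguity set collapses to the nominal distribution and the update reduces to the classical Bellman update; as `kappa` grows, the value decreases toward the smallest attainable outcome on the support of `p_bar`.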
Author Information
Chin Pang Ho (City University of Hong Kong)
Marek Petrik (University of New Hampshire)
Wolfram Wiesemann (Imperial College)
More from the Same Authors
- 2021: Unbiased Efficient Feature Counts for Inverse RL
  Gerard Donahue · Brendan Crowe · Marek Petrik · Daniel Brown
- 2021: Behavior Policy Search for Risk Estimators in Reinforcement Learning
  Elita Lobo · Marek Petrik · Dharmashankar Subramanian
- 2023 Poster: Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor
  Julien Grand-Clément · Marek Petrik
- 2023 Poster: Percentile Criterion Optimization in Offline Reinforcement Learning
  Cyrus Cousins · Elita Lobo · Marek Petrik · Yair Zick
- 2023 Poster: On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes
  Jia Lin Hau · Erick Delage · Mohammad Ghavamzadeh · Marek Petrik
- 2023 Poster: Improving the Knowledge Gradient Algorithm
  Le Yang · Siyang Gao · Chin Pang Ho
- 2023 Poster: Fast Bellman Updates for Wasserstein Distributionally Robust MDPs
  Zhuodong Yu · Ling Dai · Shaohang Xu · Siyang Gao · Chin Pang Ho
- 2022 Poster: Wasserstein Logistic Regression with Mixed Features
  Aras Selvi · Mohammad Reza Belbasi · Martin Haugh · Wolfram Wiesemann
- 2021: Safe RL Panel Discussion
  Animesh Garg · Marek Petrik · Shie Mannor · Claire Tomlin · Ugo Rosolia · Dylan Hadfield-Menell
- 2021 Workshop: Safe and Robust Control of Uncertain Systems
  Ashwin Balakrishna · Brijen Thananjeyan · Daniel Brown · Marek Petrik · Melanie Zeilinger · Sylvia Herbert
- 2021 Poster: Fast Algorithms for $L_\infty$-constrained S-rectangular Robust MDPs
  Bahram Behzadian · Marek Petrik · Chin Pang Ho
- 2020 Poster: Bayesian Robust Optimization for Imitation Learning
  Daniel S. Brown · Scott Niekum · Marek Petrik
- 2019 Workshop: Safety and Robustness in Decision-making
  Mohammad Ghavamzadeh · Shie Mannor · Yisong Yue · Marek Petrik · Yinlam Chow
- 2019 Poster: Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization
  Viet Anh Nguyen · Soroosh Shafieezadeh Abadeh · Man-Chung Yue · Daniel Kuhn · Wolfram Wiesemann
- 2019 Poster: Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation
  Viet Anh Nguyen · Soroosh Shafieezadeh Abadeh · Man-Chung Yue · Daniel Kuhn · Wolfram Wiesemann
- 2019 Poster: Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs
  Marek Petrik · Reazul Hasan Russel
- 2018 Poster: Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes
  Andrea Tirinzoni · Marek Petrik · Xiangli Chen · Brian Ziebart
- 2018 Spotlight: Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes
  Andrea Tirinzoni · Marek Petrik · Xiangli Chen · Brian Ziebart
- 2016 Poster: Safe Policy Improvement by Minimizing Robust Baseline Regret
  Mohammad Ghavamzadeh · Marek Petrik · Yinlam Chow
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty)
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar
- 2014 Poster: RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
  Marek Petrik · Dharmashankar Subramanian
- 2014 Spotlight: RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
  Marek Petrik · Dharmashankar Subramanian