Spotlight
RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
Marek Petrik · Dharmashankar Subramanian
Wed Dec 10 12:30 PM -- 12:50 PM (PST) @ Level 2, room 210
We describe how to use robust Markov decision processes for value function approximation with state aggregation. The robustness reduces the sensitivity of the resulting sub-optimal policies to approximation error, compared with classical methods such as fitted value iteration. As a result, the bounds on the gamma-discounted infinite-horizon performance loss shrink by a factor of 1/(1-gamma) while polynomial-time computational complexity is preserved. Our experiments show that the robust representation can significantly improve solution quality at minimal additional computational cost.
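To make the aggregation-plus-robustness idea concrete, the sketch below shows pessimistic value iteration on a small tabular MDP with state aggregation: within each aggregate state the backup takes the worst case over its member ground states, whereas classical aggregation would average or project them. This is a minimal illustration under assumed names, not the paper's RAAM algorithm; the arrays `P` and `R`, the mapping `agg`, and the function `robust_aggregated_vi` are all hypothetical.

```python
# Minimal sketch of pessimistic (robust) value iteration on an aggregated MDP.
# Illustrative only: the min-over-members backup is a simplification of the
# robust formulation described in the paper, not its exact construction.
import numpy as np

def robust_aggregated_vi(P, R, agg, n_clusters, gamma=0.95, iters=500):
    """P: (S, A, S) transition probabilities, R: (S, A) rewards,
    agg: length-S integer array mapping each ground state to its cluster."""
    v = np.zeros(n_clusters)                      # value of each aggregate state
    for _ in range(iters):
        # expected next-cluster value for every (state, action) pair
        q = R + gamma * (P @ v[agg])              # shape (S, A)
        best = q.max(axis=1)                      # greedy action value per ground state
        # robust backup: take the worst member state within each cluster
        v_new = np.array([best[agg == c].min() for c in range(n_clusters)])
        if np.max(np.abs(v_new - v)) < 1e-8:      # stop near the fixed point
            v = v_new
            break
        v = v_new
    return v

# Toy usage: 4 ground states aggregated into 2 clusters, 2 actions.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(4, 2))        # random row-stochastic transitions
R = rng.random((4, 2))
agg = np.array([0, 0, 1, 1])
print(robust_aggregated_vi(P, R, agg, n_clusters=2))
```

The pessimistic backup is still a gamma-contraction, so the iteration converges; the point of the robust treatment is that the resulting policy's performance-loss bound degrades more gracefully with aggregation error than the classical averaged backup.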
Author Information
Marek Petrik (University of New Hampshire)
Dharmashankar Subramanian (IBM Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2014 Poster: RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning »
  Thu. Dec 11th 12:00 -- 04:59 AM Room Level 2, room 210D
More from the Same Authors
- 2021 : Behavior Policy Search for Risk Estimators in Reinforcement Learning »
  Elita Lobo · Marek Petrik · Dharmashankar Subramanian
- 2023 Poster: Pairwise Causality Guided Transformers for Event Sequences »
  Xiao Shou · Debarun Bhattacharjya · Tian Gao · Dharmashankar Subramanian · Oktie Hassanzadeh · Kristin P Bennett
- 2021 Poster: Causal Inference for Event Pairs in Multivariate Point Processes »
  Tian Gao · Dharmashankar Subramanian · Debarun Bhattacharjya · Xiao Shou · Nicholas Mattei · Kristin P Bennett
- 2018 Poster: Proximal Graphical Event Models »
  Debarun Bhattacharjya · Dharmashankar Subramanian · Tian Gao
- 2018 Spotlight: Proximal Graphical Event Models »
  Debarun Bhattacharjya · Dharmashankar Subramanian · Tian Gao
- 2016 Poster: Safe Policy Improvement by Minimizing Robust Baseline Regret »
  Mohammad Ghavamzadeh · Marek Petrik · Yinlam Chow
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar