Spotlight
RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
Marek Petrik · Dharmashankar Subramanian
Wed Dec 10 12:30 PM -- 12:50 PM (PST) @ Level 2, room 210
We describe how to use robust Markov decision processes for value function approximation with state aggregation. The robustness serves to reduce the sensitivity to the approximation error of sub-optimal policies, in comparison with classical methods such as fitted value iteration. This reduces the bound on the gamma-discounted infinite-horizon performance loss by a factor of 1/(1-gamma) while preserving polynomial-time computational complexity. Our experimental results show that the robust representation can significantly improve solution quality with minimal additional computational cost.
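To make the idea in the abstract concrete, the sketch below shows one plausible reading of robust value iteration on a state-aggregated MDP: instead of averaging member-state values within each aggregate (as classical aggregation does), nature adversarially picks the worst member state, yielding a pessimistic, robust value for the aggregate. This is a minimal illustrative sketch, not the paper's exact RAAM formulation; the function name `robust_aggregated_vi` and all inputs are hypothetical.

```python
import numpy as np

def robust_aggregated_vi(P, r, agg, gamma=0.9, n_iters=1000, tol=1e-8):
    """Robust value iteration on a state-aggregated MDP (illustrative sketch).

    P:   (A, S, S) array of transition probabilities of the original MDP
    r:   (A, S) array of rewards
    agg: length-S integer array mapping each original state to its aggregate
    Within each aggregate, nature adversarially selects the worst member
    state, giving a robust (pessimistic) value per aggregate state.
    """
    A, S, _ = P.shape
    K = agg.max() + 1
    v = np.zeros(K)  # one value per aggregate state
    for _ in range(n_iters):
        # Bellman backup for each original state under the aggregate values
        q = r + gamma * (P @ v[agg])        # shape (A, S)
        v_state = q.max(axis=0)             # greedy over actions, shape (S,)
        # robust aggregation: worst case over the members of each aggregate
        v_new = np.array([v_state[agg == k].min() for k in range(K)])
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

# tiny 2-action, 4-state example with aggregates {0, 1} and {2, 3}
rng = np.random.default_rng(0)
P = rng.random((2, 4, 4))
P /= P.sum(axis=2, keepdims=True)           # normalize rows to distributions
r = rng.random((2, 4))
print(robust_aggregated_vi(P, r, np.array([0, 0, 1, 1])))
```

Because the min over member states keeps the backup a gamma-contraction, the iteration converges just as standard value iteration does; replacing the `min` with a weighted average recovers the classical (non-robust) aggregation baseline.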
Author Information
Marek Petrik (University of New Hampshire)
Dharmashankar Subramanian (IBM Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2014 Poster: RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
  Thu Dec 11 12:00 -- 04:59 AM @ Level 2, room 210D
More from the Same Authors
- 2018 Poster: Proximal Graphical Event Models
  Debarun Bhattacharjya · Dharmashankar Subramanian · Tian Gao
- 2018 Spotlight: Proximal Graphical Event Models
  Debarun Bhattacharjya · Dharmashankar Subramanian · Tian Gao
- 2016 Poster: Safe Policy Improvement by Minimizing Robust Baseline Regret
  Mohammad Ghavamzadeh · Marek Petrik · Yinlam Chow
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty)
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar