Timezone: »
In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such problem as CVaR MDP. Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget. This result, which is of independent interest, motivates CVaR MDPs as a unifying framework for risk-sensitive and robust decision making. Our second contribution is to present a value-iteration algorithm for CVaR MDPs, and analyze its convergence rate. To our knowledge, this is the first solution algorithm for CVaR MDPs that enjoys error guarantees. Finally, we present results from numerical experiments that corroborate our theoretical findings and show the practicality of our approach.
Author Information
Yinlam Chow (Stanford)
Aviv Tamar (UC Berkeley)
Shie Mannor (Technion)
Marco Pavone (Stanford University)
More from the Same Authors
-
2021 Spotlight: RL for Latent MDPs: Regret Guarantees and a Lower Bound »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2021 : Reinforcement Learning in Reward-Mixing MDPs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2021 : Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning »
Guy Tennenholtz · Assaf Hallak · Gal Dalal · Shie Mannor · Gal Chechik · Uri Shalit -
2021 : Latent Geodesics of Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Nir Baram · Shie Mannor -
2021 : Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning »
Roy Zohar · Shie Mannor · Guy Tennenholtz -
2022 : Foundation Models for Semantic Novelty in Reinforcement Learning »
Tarun Gupta · Peter Karkus · Tong Che · Danfei Xu · Marco Pavone -
2022 : DiffStack: A Differentiable and Modular Control Stack for Autonomous Vehicles »
Peter Karkus · Boris Ivanovic · Shie Mannor · Marco Pavone -
2022 : Robust Trajectory Prediction against Adversarial Attacks »
Yulong Cao · Danfei Xu · Xinshuo Weng · Zhuoqing Morley Mao · Anima Anandkumar · Chaowei Xiao · Marco Pavone -
2022 : AdvDO: Realistic Adversarial Attacks for Trajectory Prediction »
Yulong Cao · Chaowei Xiao · Anima Anandkumar · Danfei Xu · Marco Pavone -
2022 : Conformal Semantic Keypoint Detection with Statistical Guarantees »
Heng Yang · Marco Pavone -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 : Expanding the Deployment Envelope of Behavior Prediction via Adaptive Meta-Learning »
Boris Ivanovic · James Harrison · Marco Pavone -
2022 : SoftTreeMax: Policy Gradient with Tree Search »
Gal Dalal · Assaf Hallak · Shie Mannor · Gal Chechik -
2022 : Conformal Semantic Keypoint Detection with Statistical Guarantees »
Heng Yang · Marco Pavone -
2023 Poster: PAC-Bayes Generalization Certificates for Learned Inductive Conformal Prediction »
Apoorva Sharma · Sushant Veer · Asher Hancock · Heng Yang · Marco Pavone · Anirudha Majumdar -
2023 Poster: Policy Gradient for Rectangular Robust Markov Decision Processes »
Navdeep Kumar · Esther Derman · Matthieu Geist · Kfir Y. Levy · Shie Mannor -
2023 Poster: Individualized Dosing Dynamics via Neural Eigen Decomposition »
Stav Belogolovsky · Ido Greenberg · Danny Eytan · Shie Mannor -
2023 Poster: Train Hard, Fight Easy: Robust Meta Reinforcement Learning »
Ido Greenberg · Shie Mannor · Gal Chechik · Eli Meirom -
2023 Poster: Optimization or Architecture: What Matters in Non-Linear Filtering? »
Ido Greenberg · Netanel Yannay · Shie Mannor -
2023 Poster: trajdata: A Unified Interface to Multiple Human Trajectory Datasets »
Boris Ivanovic · Guanyu Song · Igor Gilitschenski · Marco Pavone -
2022 : Invited Talk: Marco Pavone »
Marco Pavone -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 Poster: Tractable Optimality in Episodic Latent MABs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2022 Poster: Reinforcement Learning with a Terminator »
Guy Tennenholtz · Nadav Merlis · Lior Shani · Shie Mannor · Uri Shalit · Gal Chechik · Assaf Hallak · Gal Dalal -
2022 Poster: Finite Sample Analysis Of Dynamic Regression Parameter Learning »
Mark Kozdoba · Edward Moroshko · Shie Mannor · Yacov Crammer -
2022 Poster: Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Shie Mannor -
2022 Poster: Efficient Risk-Averse Reinforcement Learning »
Ido Greenberg · Yinlam Chow · Mohammad Ghavamzadeh · Shie Mannor -
2021 : Safe RL Panel Discussion »
Animesh Garg · Marek Petrik · Shie Mannor · Claire Tomlin · Ugo Rosolia · Dylan Hadfield-Menell -
2021 : Shie Mannor »
Shie Mannor -
2021 : Shie Mannor »
Shie Mannor -
2021 Poster: Data Sharing and Compression for Cooperative Networked Control »
Jiangnan Cheng · Marco Pavone · Sachin Katti · Sandeep Chinchali · Ao Tang -
2021 Poster: Twice regularized MDPs and the equivalence between robustness and regularization »
Esther Derman · Matthieu Geist · Shie Mannor -
2021 Poster: RL for Latent MDPs: Regret Guarantees and a Lower Bound »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2021 Poster: Sim and Real: Better Together »
Shirli Di-Castro · Dotan Di Castro · Shie Mannor -
2021 Poster: Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction »
Gal Dalal · Assaf Hallak · Steven Dalton · iuri frosio · Shie Mannor · Gal Chechik -
2021 Poster: Reinforcement Learning in Reward-Mixing MDPs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2020 Poster: Continuous Meta-Learning without Tasks »
James Harrison · Apoorva Sharma · Chelsea Finn · Marco Pavone -
2020 Poster: Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders »
Masha Itkina · Boris Ivanovic · Ransalu Senanayake · Mykel J Kochenderfer · Marco Pavone -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Adaptive Trust Region Policy Optimization: Convergence and Faster Rates of regularized MDPs »
Lior Shani · Yonathan Efroni · Shie Mannor -
2019 : Marco Pavone: On Safe and Efficient Human-robot Interactions via Multi-modal Intent Modeling and Reachability-based Safety Assurance »
Marco Pavone -
2019 : Aviv Tamar: Visual Plan Imagination - An Interpretable Robot Learning Framework »
Aviv Tamar -
2019 Workshop: Safety and Robustness in Decision-making »
Mohammad Ghavamzadeh · Shie Mannor · Yisong Yue · Marek Petrik · Yinlam Chow -
2019 Poster: High-Dimensional Optimization in Adaptive Random Subspaces »
Jonathan Lacotte · Mert Pilanci · Marco Pavone -
2019 Poster: Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies »
Yonathan Efroni · Nadav Merlis · Mohammad Ghavamzadeh · Shie Mannor -
2019 Spotlight: Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies »
Yonathan Efroni · Nadav Merlis · Mohammad Ghavamzadeh · Shie Mannor -
2018 : Panel »
Yimeng Zhang · Alfredo Canziani · Marco Pavone · Dorsa Sadigh · Kurt Keutzer -
2018 : Invited Talk: Marco Pavone, Stanford »
Marco Pavone -
2018 Poster: Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning »
Yonathan Efroni · Gal Dalal · Bruno Scherrer · Shie Mannor -
2018 Spotlight: Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning »
Yonathan Efroni · Gal Dalal · Bruno Scherrer · Shie Mannor -
2017 Poster: Rotting Bandits »
Nir Levine · Yacov Crammer · Shie Mannor -
2017 Poster: Shallow Updates for Deep Reinforcement Learning »
Nir Levine · Tom Zahavy · Daniel J Mankowitz · Aviv Tamar · Shie Mannor -
2016 Poster: Safe Policy Improvement by Minimizing Robust Baseline Regret »
Mohammad Ghavamzadeh · Marek Petrik · Yinlam Chow -
2016 Poster: Adaptive Skills Adaptive Partitions (ASAP) »
Daniel J Mankowitz · Timothy A Mann · Shie Mannor -
2015 : Between stochastic and adversarial: forecasting with online ARMA models »
Shie Mannor -
2015 Workshop: Machine Learning for (e-)Commerce »
Esteban Arcaute · Mohammad Ghavamzadeh · Shie Mannor · Georgios Theocharous -
2015 Poster: Online Learning for Adversaries with Memory: Price of Past Mistakes »
Oren Anava · Elad Hazan · Shie Mannor -
2015 Poster: Policy Gradient for Coherent Risk Measures »
Aviv Tamar · Yinlam Chow · Mohammad Ghavamzadeh · Shie Mannor -
2015 Poster: Community Detection via Measure Space Embedding »
Mark Kozdoba · Shie Mannor -
2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar -
2014 Poster: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2014 Poster: Robust Logistic Regression and Classification »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan -
2014 Oral: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2014 Poster: Algorithms for CVaR Optimization in MDPs »
Yinlam Chow · Mohammad Ghavamzadeh -
2013 Poster: Reinforcement Learning in Robust Markov Decision Processes »
Shiau Hong Lim · Huan Xu · Shie Mannor -
2013 Poster: Online PCA for Contaminated Data »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan -
2013 Poster: Learning Multiple Models via Regularized Weighting »
Daniel Vainsencher · Shie Mannor · Huan Xu -
2012 Poster: The Perturbed Variation »
Maayan Harel · Shie Mannor -
2011 Poster: From Bandits to Experts: On the Value of Side-Observations »
Shie Mannor · Ohad Shamir -
2011 Spotlight: From Bandits to Experts: On the Value of Side-Observations »
Shie Mannor · Ohad Shamir -
2011 Poster: Committing Bandits »
Loc X Bui · Ramesh Johari · Shie Mannor -
2010 Spotlight: Online Classification with Specificity Constraints »
Andrey Bernstein · Shie Mannor · Nahum Shimkin -
2010 Poster: Online Classification with Specificity Constraints »
Andrey Bernstein · Shie Mannor · Nahum Shimkin -
2010 Poster: Distributionally Robust Markov Decision Processes »
Huan Xu · Shie Mannor