When observed decisions depend only on observed features, off-policy policy evaluation (OPE) methods for sequential decision problems can estimate the performance of evaluation policies before deploying them. However, this assumption is frequently violated due to unobserved confounders, unrecorded variables that impact both the decisions and their outcomes. We assess robustness of OPE methods under unobserved confounding by developing worst-case bounds on the performance of an evaluation policy. When unobserved confounders can affect every decision in an episode, we demonstrate that even small amounts of per-decision confounding can heavily bias OPE methods. Fortunately, in a number of important settings found in healthcare, policy-making, and technology, unobserved confounders may directly affect only one of the many decisions made, and influence future decisions/rewards only through the directly affected decision. Under this less pessimistic model of one-decision confounding, we propose an efficient loss-minimization-based procedure for computing worst-case bounds, and prove its statistical consistency. On simulated healthcare examples (management of sepsis and interventions for autistic children) where this is a reasonable model, we demonstrate that our method invalidates non-robust results and provides meaningful certificates of robustness, allowing reliable selection of policies under unobserved confounding.
Author Information
Hongseok Namkoong (Stanford University)
Ramtin Keramati (Stanford University)
Steve Yadlowsky (Google Research, Brain Team)
Emma Brunskill (Stanford University)
More from the Same Authors
-
2021 Spotlight: SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression »
Steve Yadlowsky · Taedong Yun · Cory Y McLean · Alexander D'Amour -
2021 Spotlight: Counterfactual Invariance to Spurious Correlations in Text Classification »
Victor Veitch · Alexander D'Amour · Steve Yadlowsky · Jacob Eisenstein -
2021 : Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation »
Ramtin Keramati · Omer Gottesman · Leo Celi · Finale Doshi-Velez · Emma Brunskill -
2021 : Robust fine-tuning of zero-shot models »
Mitchell Wortsman · Gabriel Ilharco · Jong Wook Kim · Mike Li · Hanna Hajishirzi · Ali Farhadi · Hongseok Namkoong · Ludwig Schmidt -
2022 : Tailored Overlap for Learning Under Distribution Shift »
David Bruns-Smith · Alexander D'Amour · Avi Feller · Steve Yadlowsky -
2023 Poster: In-Context Decision-Making from Supervised Pretraining »
Jonathan N Lee · Annie Xie · Aldo Pacchiano · Yash Chandak · Chelsea Finn · Ofir Nachum · Emma Brunskill -
2023 Poster: Experiment Planning with Function Approximation »
Aldo Pacchiano · Jonathan N Lee · Emma Brunskill -
2023 Poster: Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization »
Sanath Kumar Krishnamurthy · Ruohan Zhan · Susan Athey · Emma Brunskill -
2023 Poster: Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets »
Anirudhan Badrinath · Yannis Flet-Berliac · Allen Nie · Emma Brunskill -
2022 Workshop: Reinforcement Learning for Real Life (RL4RealLife) Workshop »
Yuxi Li · Emma Brunskill · MINMIN CHEN · Omer Gottesman · Lihong Li · Yao Liu · Zhiwei Tony Qin · Matthew Taylor -
2022 Poster: Oracle Inequalities for Model Selection in Offline Reinforcement Learning »
Jonathan N Lee · George Tucker · Ofir Nachum · Bo Dai · Emma Brunskill -
2022 Poster: Factored DRO: Factored Distributionally Robust Policies for Contextual Bandits »
Tong Mu · Yash Chandak · Tatsunori Hashimoto · Emma Brunskill -
2022 Poster: Off-Policy Evaluation for Action-Dependent Non-stationary Environments »
Yash Chandak · Shiv Shankar · Nathaniel Bastian · Bruno da Silva · Emma Brunskill · Philip Thomas -
2022 Poster: Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data »
Allen Nie · Yannis Flet-Berliac · Deon Jordan · William Steenbergen · Emma Brunskill -
2022 Poster: Giving Feedback on Interactive Student Programs with Meta-Exploration »
Evan Liu · Moritz Stephan · Allen Nie · Chris Piech · Emma Brunskill · Chelsea Finn -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 : Safe RL Debate »
Sylvia Herbert · Animesh Garg · Emma Brunskill · Aleksandra Faust · Dylan Hadfield-Menell -
2021 Poster: Play to Grade: Testing Coding Games as Classifying Markov Decision Process »
Allen Nie · Emma Brunskill · Chris Piech -
2021 Poster: Reinforcement Learning with State Observation Costs in Action-Contingent Noiselessly Observable Markov Decision Processes »
HyunJi Alex Nam · Scott Fleming · Emma Brunskill -
2021 Poster: Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning »
Andrea Zanette · Martin J Wainwright · Emma Brunskill -
2021 Poster: SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression »
Steve Yadlowsky · Taedong Yun · Cory Y McLean · Alexander D'Amour -
2021 Poster: Universal Off-Policy Evaluation »
Yash Chandak · Scott Niekum · Bruno da Silva · Erik Learned-Miller · Emma Brunskill · Philip Thomas -
2021 Poster: Evaluating model performance under worst-case subpopulations »
Mike Li · Hongseok Namkoong · Shangzhou Xia -
2021 Poster: Design of Experiments for Stochastic Contextual Linear Bandits »
Andrea Zanette · Kefan Dong · Jonathan N Lee · Emma Brunskill -
2021 Poster: Counterfactual Invariance to Spurious Correlations in Text Classification »
Victor Veitch · Alexander D'Amour · Steve Yadlowsky · Jacob Eisenstein -
2020 : Counterfactuals and Offline RL »
Emma Brunskill -
2020 : Q & A and Panel Session with Dan Weld, Kristen Grauman, Scott Yih, Emma Brunskill, and Alex Ratner »
Kristen Grauman · Wen-tau Yih · Alexander Ratner · Emma Brunskill · Douwe Kiela · Daniel S. Weld -
2020 : Contributed Talk 7: Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning »
Samuel Daulton · Hongseok Namkoong -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 : Mini-panel discussion 1 - Bridging the gap between theory and practice »
Aviv Tamar · Emma Brunskill · Jost Tobias Springenberg · Omer Gottesman · Daniel Mankowitz -
2020 : Keynote: Emma Brunskill »
Emma Brunskill -
2020 : Panel discussion on minimizing bias in machine learning in education »
Neil Heffernan · Osonde A. Osoba · Emma Brunskill · Kathi Fisler -
2020 Poster: Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration »
Andrea Zanette · Alessandro Lazaric · Mykel J Kochenderfer · Emma Brunskill -
2020 Poster: Provably Good Batch Reinforcement Learning Without Great Exploration »
Yao Liu · Adith Swaminathan · Alekh Agarwal · Emma Brunskill -
2019 : Emma Brunskill, "Some Theory RL Challenges Inspired by Education" »
Emma Brunskill -
2019 : Invited Talk »
Emma Brunskill -
2019 : Coffee break, posters, and 1-on-1 discussions »
Yangyi Lu · Daniel Chen · Hongseok Namkoong · Marie Charpignon · Maja Rudolph · Amanda Coston · Julius von Kügelgen · Niranjani Prasad · Paramveer Dhillon · Yunzong Xu · Yixin Wang · Alexander Markham · David Rohde · Rahul Singh · Zichen Zhang · Negar Hassanpour · Ankit Sharma · Ciarán Lee · Jean Pouget-Abadie · Jesse Krijthe · Divyat Mahajan · Nan Rosemary Ke · Peter Wirnsberger · Vira Semenova · Dmytro Mykhaylov · Dennis Shen · Kenta Takatsu · Liyang Sun · Jeremy Yang · Alexander Franks · Pak Kan Wong · Tauhid Zaman · Shira Mitchell · min kyoung kang · Qi Yang -
2019 : Poster Spotlights »
Hongseok Namkoong · Marie Charpignon · Maja Rudolph · Amanda Coston · Yuta Saito · Paramveer Dhillon · Alexander Markham -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 Poster: Offline Contextual Bandits with High Probability Fairness Guarantees »
Blossom Metevier · Stephen Giguere · Sarah Brockman · Ari Kobren · Yuriy Brun · Emma Brunskill · Philip Thomas -
2019 Poster: Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model »
Andrea Zanette · Mykel J Kochenderfer · Emma Brunskill -
2019 Poster: Limiting Extrapolation in Linear Approximate Value Iteration »
Andrea Zanette · Alessandro Lazaric · Mykel J Kochenderfer · Emma Brunskill -
2018 Poster: Representation Balancing MDPs for Off-policy Policy Evaluation »
Yao Liu · Omer Gottesman · Aniruddh Raghu · Matthieu Komorowski · Aldo Faisal · Finale Doshi-Velez · Emma Brunskill -
2018 Poster: Generalizing to Unseen Domains via Adversarial Data Augmentation »
Riccardo Volpi · Hongseok Namkoong · Ozan Sener · John Duchi · Vittorio Murino · Silvio Savarese -
2018 Demonstration: Automatic Curriculum Generation Applied to Teaching Novices a Short Bach Piano Segment »
Emma Brunskill · Tong Mu · Karan Goel · Jonathan Bragg -
2018 Poster: Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation »
Matthew O'Kelly · Aman Sinha · Hongseok Namkoong · Russ Tedrake · John Duchi -
2017 : Panel Discussion »
Matt Botvinick · Emma Brunskill · Marcos Campos · Jan Peters · Doina Precup · David Silver · Josh Tenenbaum · Roy Fox -
2017 : Sample efficiency and off policy hierarchical RL (Emma Brunskill) »
Emma Brunskill -
2017 : Emma Brunskill (Stanford) »
Emma Brunskill -
2017 : Invited Talk »
Emma Brunskill -
2017 Poster: Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation »
Zhaohan Guo · Philip S. Thomas · Emma Brunskill -
2017 Poster: Variance-based Regularization with Convex Objectives »
Hongseok Namkoong · John Duchi -
2017 Poster: Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning »
Christoph Dann · Tor Lattimore · Emma Brunskill -
2017 Spotlight: Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning »
Christoph Dann · Tor Lattimore · Emma Brunskill -
2017 Oral: Variance-based Regularization with Convex Objectives »
Hongseok Namkoong · John Duchi -
2017 Tutorial: Reinforcement Learning with People »
Emma Brunskill -
2016 Poster: Stochastic Gradient Methods for Distributionally Robust Optimization with f-divergences »
Hongseok Namkoong · John Duchi