Timezone: »
Off-policy evaluation of sequential decision policies from observational data is necessary in applications of batch reinforcement learning such as education and healthcare. In such settings, however, unobserved variables confound observed actions, rendering exact evaluation of new policies impossible, i.e, unidentifiable. We develop a robust approach that estimates sharp bounds on the (unidentifiable) value of a given policy in an infinite-horizon problem given data from another policy with unobserved confounding, subject to a sensitivity model. We consider stationary unobserved confounding and compute bounds by optimizing over the set of all stationary state-occupancy ratios that agree with a new partially identified estimating equation and the sensitivity model. We prove convergence to the sharp bounds as we collect more confounded data. Although checking set membership is a linear program, the support function is given by a difficult nonconvex optimization problem. We develop approximations based on nonconvex projected gradient descent and demonstrate the resulting bounds empirically.
Author Information
Nathan Kallus (Cornell University)
Angela Zhou (Cornell University)
More from the Same Authors
-
2021 : It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks »
Michelle Bao · Angela Zhou · Samantha Zottola · Brian Brubach · Sarah Desmarais · Aaron Horowitz · Kristian Lum · Suresh Venkatasubramanian -
2021 : Stateful Offline Contextual Policy Evaluation and Learning »
Angela Zhou -
2023 Poster: Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage »
Masatoshi Uehara · Nathan Kallus · Jason Lee · Wen Sun -
2023 Poster: The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning »
Kaiwen Wang · Kevin Zhou · Runzhe Wu · Nathan Kallus · Wen Sun -
2023 Poster: Future-Dependent Value-Based Off-Policy Evaluation in POMDPs »
Masatoshi Uehara · Haruka Kiyohara · Andrew Bennett · Victor Chernozhukov · Nan Jiang · Nathan Kallus · Chengchun Shi · Wen Sun -
2022 Panel: Panel 3C-5: Biologically-Plausible Determinant Maximization… & What's the Harm? ... »
Bariscan Bozkurt · Nathan Kallus -
2022 Poster: Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems »
Masatoshi Uehara · Ayush Sekhari · Jason Lee · Nathan Kallus · Wen Sun -
2022 Poster: The Implicit Delta Method »
Nathan Kallus · James McInerney -
2022 Poster: What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment »
Nathan Kallus -
2021 Workshop: Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice »
Aurelien Bibaut · Maria Dimakopoulou · Nathan Kallus · Xinkun Nie · Masatoshi Uehara · Kelly Zhang -
2021 : Stateful Offline Contextual Policy Evaluation and Learning »
Angela Zhou -
2021 Workshop: Machine Learning Meets Econometrics (MLECON) »
David Bruns-Smith · Arthur Gretton · Limor Gultchin · Niki Kilbertus · Krikamol Muandet · Evan Munro · Angela Zhou -
2021 Poster: Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning »
Aurelien Bibaut · Nathan Kallus · Maria Dimakopoulou · Antoine Chambaz · Mark van der Laan -
2021 Poster: Control Variates for Slate Off-Policy Evaluation »
Nikos Vlassis · Ashok Chandrashekar · Fernando Amat · Nathan Kallus -
2021 Poster: Post-Contextual-Bandit Inference »
Aurelien Bibaut · Maria Dimakopoulou · Nathan Kallus · Antoine Chambaz · Mark van der Laan -
2021 : It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks »
Michelle Bao · Angela Zhou · Samantha Zottola · Brian Brubach · Sarah Desmarais · Aaron Horowitz · Kristian Lum · Suresh Venkatasubramanian -
2020 Workshop: Consequential Decisions in Dynamic Environments »
Niki Kilbertus · Angela Zhou · Ashia Wilson · John Miller · Lily Hu · Lydia T. Liu · Nathan Kallus · Shira Mitchell -
2020 : Spotlight Talk 4: Fairness, Welfare, and Equity in Personalized Pricing »
Nathan Kallus · Angela Zhou -
2020 Poster: Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies »
Nathan Kallus · Masatoshi Uehara -
2019 : Coffee Break and Poster Session »
Rameswar Panda · Prasanna Sattigeri · Kush Varshney · Karthikeyan Natesan Ramamurthy · Harvineet Singh · Vishwali Mhasawade · Shalmali Joshi · Laleh Seyyed-Kalantari · Matthew McDermott · Gal Yona · James Atwood · Hansa Srinivasan · Yonatan Halpern · D. Sculley · Behrouz Babaki · Margarida Carvalho · Josie Williams · Narges Razavian · Haoran Zhang · Amy Lu · Irene Y Chen · Xiaojie Mao · Angela Zhou · Nathan Kallus -
2019 : Opening Remarks »
Thorsten Joachims · Nathan Kallus · Michele Santacatterina · Adith Swaminathan · David Sontag · Angela Zhou -
2019 Workshop: “Do the right thing”: machine learning and causal inference for improved decision making »
Michele Santacatterina · Thorsten Joachims · Nathan Kallus · Adith Swaminathan · David Sontag · Angela Zhou -
2019 : Nathan Kallus: Efficiently Breaking the Curse of Horizon with Double Reinforcement Learning »
Nathan Kallus -
2019 Poster: The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the XAUC Metric »
Nathan Kallus · Angela Zhou -
2019 Poster: Assessing Disparate Impact of Personalized Interventions: Identifiability and Bounds »
Nathan Kallus · Angela Zhou -
2019 Poster: Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning »
Nathan Kallus · Masatoshi Uehara -
2019 Poster: Policy Evaluation with Latent Confounders via Optimal Balance »
Andrew Bennett · Nathan Kallus -
2019 Poster: Deep Generalized Method of Moments for Instrumental Variable Analysis »
Andrew Bennett · Nathan Kallus · Tobias Schnabel -
2018 Workshop: Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy »
Manuela Veloso · Nathan Kallus · Sameena Shah · Senthil Kumar · Isabelle Moulinier · Jiahao Chen · John Paisley -
2018 Poster: Causal Inference with Noisy and Missing Covariates via Matrix Factorization »
Nathan Kallus · Xiaojie Mao · Madeleine Udell -
2018 Poster: Removing Hidden Confounding by Experimental Grounding »
Nathan Kallus · Aahlad Puli · Uri Shalit -
2018 Spotlight: Removing Hidden Confounding by Experimental Grounding »
Nathan Kallus · Aahlad Puli · Uri Shalit -
2018 Poster: Confounding-Robust Policy Improvement »
Nathan Kallus · Angela Zhou -
2018 Poster: Balanced Policy Evaluation and Learning »
Nathan Kallus -
2017 Workshop: From 'What If?' To 'What Next?' : Causal Inference and Machine Learning for Intelligent Decision Making »
Ricardo Silva · Panagiotis Toulis · John Shawe-Taylor · Alexander Volfovsky · Thorsten Joachims · Lihong Li · Nathan Kallus · Adith Swaminathan