Timezone: »
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning. We begin by defining the problem of learning from confounded expert data in a contextual MDP setup. We analyze the limitations of learning from such data with and without external reward and propose an adjustment of standard imitation learning algorithms to fit this setup. In addition, we discuss the problem of distribution shift between the expert data and the online environment when partial observability is present in the data. We prove possibility and impossibility results for imitation learning under arbitrary distribution shift of the missing covariates. When additional external reward is provided, we propose a sampling procedure that addresses the unknown shift and prove convergence to an optimal solution. Finally, we validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
Author Information
Guy Tennenholtz (Technion, Technion)
Assaf Hallak (The Technion)
Gal Dalal (NVIDIA)
Shie Mannor (Technion)
Gal Chechik (Nvidia)
Uri Shalit (Technion)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 : Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning »
Dates n/a. Room
More from the Same Authors
-
2021 : Bandits with Partially Observable Confounded Data »
Guy Tennenholtz · Uri Shalit · Shie Mannor · Yonathan Efroni -
2021 : Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning »
Guy Tennenholtz · Assaf Hallak · Gal Dalal · Shie Mannor · Gal Chechik · Uri Shalit -
2021 : Latent Geodesics of Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Nir Baram · Shie Mannor -
2021 : Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning »
Roy Zohar · Shie Mannor · Guy Tennenholtz -
2021 : Latent Geodesics of Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Nir Baram · Shie Mannor -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 : Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds »
Yoav Wald · Gal Yona · Uri Shalit · Yair Carmon -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 : SoftTreeMax: Policy Gradient with Tree Search »
Gal Dalal · Assaf Hallak · Shie Mannor · Gal Chechik -
2022 : Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs »
Benjamin Fuhrer · Yuval Shpigelman · Chen Tessler · Shie Mannor · Gal Chechik · Eitan Zahavi · Gal Dalal -
2022 Poster: Scalable Sensitivity and Uncertainty Analyses for Causal-Effect Estimates of Continuous-Valued Interventions »
Andrew Jesson · Alyson Douglas · Peter Manshausen · Maëlys Solal · Nicolai Meinshausen · Philip Stier · Yarin Gal · Uri Shalit -
2022 Poster: Reinforcement Learning with a Terminator »
Guy Tennenholtz · Nadav Merlis · Lior Shani · Shie Mannor · Uri Shalit · Gal Chechik · Assaf Hallak · Gal Dalal -
2022 Poster: Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning »
Guy Tennenholtz · Shie Mannor -
2021 : Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning (Guy Tennenholtz) »
Guy Tennenholtz -
2021 : Uri Shalit - Calibration, out-of-distribution generalization and a path towards causal representations »
Uri Shalit -
2021 Poster: Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction »
Gal Dalal · Assaf Hallak · Steven Dalton · iuri frosio · Shie Mannor · Gal Chechik -
2021 Poster: Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data »
Andrew Jesson · Panagiotis Tigas · Joost van Amersfoort · Andreas Kirsch · Uri Shalit · Yarin Gal -
2021 Poster: On Calibration and Out-of-Domain Generalization »
Yoav Wald · Amir Feder · Daniel Greenfeld · Uri Shalit -
2020 : Mini-panel discussion 2 - Real World RL: An industry perspective »
Franziska Meier · Gabriel Dulac-Arnold · Shie Mannor · Timothy A Mann -
2020 Workshop: The Challenges of Real World Reinforcement Learning »
Daniel Mankowitz · Gabriel Dulac-Arnold · Shie Mannor · Omer Gottesman · Anusha Nagabandi · Doina Precup · Timothy A Mann · Gabriel Dulac-Arnold -
2020 Poster: Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models »
Andrew Jesson · Sören Mindermann · Uri Shalit · Yarin Gal -
2020 Poster: A causal view of compositional zero-shot recognition »
Yuval Atzmon · Felix Kreuk · Uri Shalit · Gal Chechik -
2020 Spotlight: A causal view of compositional zero-shot recognition »
Yuval Atzmon · Felix Kreuk · Uri Shalit · Gal Chechik -
2020 Poster: Online Planning with Lookahead Policies »
Yonathan Efroni · Mohammad Ghavamzadeh · Shie Mannor -
2019 Poster: Distributional Policy Optimization: An Alternative Approach for Continuous Control »
Chen Tessler · Guy Tennenholtz · Shie Mannor -
2019 Poster: Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning »
Chao Qu · Shie Mannor · Huan Xu · Yuan Qi · Le Song · Junwu Xiong -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Hierarchical RL: From Prior Knowledge to Policies (Shie Mannor) »
Shie Mannor -
2018 Poster: Removing Hidden Confounding by Experimental Grounding »
Nathan Kallus · Aahlad Puli · Uri Shalit -
2018 Spotlight: Removing Hidden Confounding by Experimental Grounding »
Nathan Kallus · Aahlad Puli · Uri Shalit -
2018 Poster: Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning »
Tom Zahavy · Matan Haroush · Nadav Merlis · Daniel J Mankowitz · Shie Mannor -
2017 Workshop: Hierarchical Reinforcement Learning »
Andrew G Barto · Doina Precup · Shie Mannor · Tom Schaul · Roy Fox · Carlos Florensa -
2017 Workshop: Machine Learning for Health (ML4H) - What Parts of Healthcare are Ripe for Disruption by Machine Learning Right Now? »
Jason Fries · Alex Wiltschko · Andrew Beam · Isaac S Kohane · Jasper Snoek · Peter Schulam · Madalina Fiterau · David Kale · Rajesh Ranganath · Bruno Jedynak · Michael Hughes · Tristan Naumann · Natalia Antropova · Adrian Dalca · SHUBHI ASTHANA · Prateek Tandon · Jaz Kandola · Uri Shalit · Marzyeh Ghassemi · Tim Althoff · Alexander Ratner · Jumana Dakka -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris Mooij · David Sontag · Richard Zemel · Max Welling -
2016 Workshop: Machine Learning for Health »
Uri Shalit · Marzyeh Ghassemi · Jason Fries · Rajesh Ranganath · Theofanis Karaletsos · David Kale · Peter Schulam · Madalina Fiterau -
2010 Spotlight: Online Learning in The Manifold of Low-Rank Matrices »
Uri Shalit · Daphna Weinshall · Gal Chechik -
2010 Poster: Online Learning in The Manifold of Low-Rank Matrices »
Uri Shalit · Daphna Weinshall · Gal Chechik -
2009 Poster: An Online Algorithm for Large Scale Image Similarity Learning »
Gal Chechik · Uri Shalit · Varun Sharma · Samy Bengio