Timezone: »
Off-policy evaluation provides an essential tool for evaluating the effects of different policies or treatments using only observed data. When applied to high-stakes scenarios such as medical diagnosis or financial decision-making, it is essential to provide provably correct upper and lower bounds of the expected reward, not just a classical single point estimate, to the end-users, as executing a poor policy can be very costly. In this work, we propose a provably correct method for obtaining interval bounds for off-policy evaluation in a general continuous setting. The idea is to search for the maximum and minimum values of the expected reward among all the Lipschitz Q-functions that are consistent with the observations, which amounts to solving a constrained optimization problem on a Lipschitz function space. We go on to introduce a Lipschitz value iteration method to monotonically tighten the interval, which is simple yet efficient and provably convergent. We demonstrate the practical efficiency of our method on a range of benchmarks.
Author Information
Ziyang Tang (UT Austin)
Yihao Feng (UT Austin)
I am a Ph.D student at UT Austin, where I work on Reinforcement Learning and Approximate Inference. I am looking for internships for summer 2020! Please feel free to contact me (yihao AT cs.utexas.edu) if you have open positions!
Na Zhang (Tsinghua University)
Jian Peng (University of Illinois at Urbana-Champaign)
Qiang Liu (UT Austin)
More from the Same Authors
-
2021 : Imitation Learning from Observations under Transition Model Disparity »
Tanmay Gangwani · Yuan Zhou · Jian Peng -
2021 : Hindsight Foresight Relabeling for Meta-Reinforcement Learning »
Michael Wan · Jian Peng · Tanmay Gangwani -
2022 Poster: Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation »
Zhizhou Ren · Anji Liu · Yitao Liang · Jian Peng · Jianzhu Ma -
2022 Poster: Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures »
Shitong Luo · Yufeng Su · Xingang Peng · Sheng Wang · Jian Peng · Jianzhu Ma -
2023 Poster: Equivariant Neural Operator Learning with Graphon Convolution »
Chaoran Cheng · Jian Peng -
2023 Poster: LinkerNet: Fragment Poses and Linker Co-Design with 3D Equivariant Diffusion »
Jiaqi Guan · Xingang Peng · PeiQi Jiang · Yunan Luo · Jian Peng · Jianzhu Ma -
2021 Poster: A 3D Generative Model for Structure-Based Drug Design »
Shitong Luo · Jiaqi Guan · Jianzhu Ma · Jian Peng -
2020 Poster: Stein Self-Repulsive Dynamics: Benefits From Past Samples »
Mao Ye · Tongzheng Ren · Qiang Liu -
2020 Poster: Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework »
Dinghuai Zhang · Mao Ye · Chengyue Gong · Zhanxing Zhu · Qiang Liu -
2020 Poster: Certified Monotonic Neural Networks »
Xingchao Liu · Xing Han · Na Zhang · Qiang Liu -
2020 Poster: Learning Guidance Rewards with Trajectory-space Smoothing »
Tanmay Gangwani · Yuan Zhou · Jian Peng -
2020 Spotlight: Certified Monotonic Neural Networks »
Xingchao Liu · Xing Han · Na Zhang · Qiang Liu -
2020 Poster: Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks »
Lemeng Wu · Bo Liu · Peter Stone · Qiang Liu -
2020 Poster: Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough »
Mao Ye · Lemeng Wu · Qiang Liu -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2019 Poster: A Kernel Loss for Solving the Bellman Equation »
Yihao Feng · Lihong Li · Qiang Liu -
2019 Poster: Splitting Steepest Descent for Growing Neural Architectures »
Lemeng Wu · Dilin Wang · Qiang Liu -
2019 Spotlight: Splitting Steepest Descent for Growing Neural Architectures »
Lemeng Wu · Dilin Wang · Qiang Liu -
2019 Poster: Stein Variational Gradient Descent With Matrix-Valued Kernels »
Dilin Wang · Ziyang Tang · Chandrajit Bajaj · Qiang Liu -
2019 Poster: Thresholding Bandit with Optimal Aggregate Regret »
Chao Tao · Saúl Blanco · Jian Peng · Yuan Zhou -
2019 Poster: Exploration via Hindsight Goal Generation »
Zhizhou Ren · Kefan Dong · Yuan Zhou · Qiang Liu · Jian Peng -
2018 Poster: Variational Inference with Tail-adaptive f-Divergence »
Dilin Wang · Hao Liu · Qiang Liu -
2018 Oral: Variational Inference with Tail-adaptive f-Divergence »
Dilin Wang · Hao Liu · Qiang Liu -
2018 Poster: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation »
Qiang Liu · Lihong Li · Ziyang Tang · Denny Zhou -
2018 Spotlight: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation »
Qiang Liu · Lihong Li · Ziyang Tang · Denny Zhou -
2018 Poster: Stein Variational Gradient Descent as Moment Matching »
Qiang Liu · Dilin Wang