Timezone: »
Offline RL is an important step towards making data-hungry RL algorithms more widely usable in the real world, but conventional assumptions on the distribution of logging data do not apply in some key real-world scenarios. In particular, it is unrealistic to assume that RL practitioners will have access to sets of trajectories that simultaneously are mutually independent and explore well. We propose two natural ways to relax these assumptions: by allowing the data to be distributed according to different logging policies independently, and by allowing logging policies to depend on past trajectories. We discuss Offline Policy Evaluation (OPE) in these settings, analyzing the performance of a model-based OPE estimator when the MDP is tabular.
Author Information
Sunil Madhow (, University of California, Santa Barbara)
Dan Qiao (UCSB)
Yu-Xiang Wang (UC Santa Barbara)
More from the Same Authors
-
2021 : Instance-dependent Offline Reinforcement Learning: From tabular RL to linear MDPs »
Ming Yin · Yu-Xiang Wang -
2022 : Generalized PTR: User-Friendly Recipes for Data-Adaptive Algorithms with Differential Privacy »
Rachel Redberg · Yuqing Zhu · Yu-Xiang Wang -
2022 : VOTING-BASED APPROACHES FOR DIFFERENTIALLY PRIVATE FEDERATED LEARNING »
Yuqing Zhu · Xiang Yu · Yi-Hsuan Tsai · Francesco Pittaluga · Masoud Faraki · Manmohan Chandraker · Yu-Xiang Wang -
2022 : Offline Reinforcement Learning with Closed-Form Policy Improvement Operators »
Jiachen Li · Edwin Zhang · Ming Yin · Qinxun Bai · Yu-Xiang Wang · William Yang Wang -
2022 : Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation »
Dan Qiao · Yu-Xiang Wang -
2022 : Differentially Private Gradient Boosting on Linear Learners for Tabular Data »
Saeyoung Rho · Shuai Tang · Sergul Aydore · Michael Kearns · Aaron Roth · Yu-Xiang Wang · Steven Wu · Cedric Archambeau -
2022 : Differentially Private Bias-Term only Fine-tuning of Foundation Models »
Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis -
2023 Poster: Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger »
Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis -
2023 Poster: Offline Reinforcement Learning with Differential Privacy »
Dan Qiao · Yu-Xiang Wang -
2023 Poster: Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation »
Lijing Kuang · Ming Yin · Mengdi Wang · Yu-Xiang Wang · Yian Ma -
2023 Poster: Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms »
Dheeraj Baby · Saurabh Garg · Tzu-Ching Yen · Sivaraman Balakrishnan · Zachary Lipton · Yu-Xiang Wang -
2023 Poster: Improving the Privacy and Practicality of Objective Perturbation for Differentially Private Linear Learners »
Rachel Redberg · Antti Koskela · Yu-Xiang Wang -
2023 Poster: A Privacy-Friendly Approach to Data Valuation »
Jiachen T. Wang · Yuqing Zhu · Yu-Xiang Wang · Ruoxi Jia · Prateek Mittal -
2022 : Contributed Talk: Differentially Private Bias-Term only Fine-tuning of Foundation Models »
Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis -
2022 : Panel on Privacy and Security in Machine Learning Systems »
Graham Cormode · Borja Balle · Yu-Xiang Wang · Alejandro Saucedo · Neil Lawrence -
2022 : Practical differential privacy »
Yu-Xiang Wang · Fariba Yousefi -
2022 : Practical differential privacy »
Yu-Xiang Wang -
2022 Poster: SeqPATE: Differentially Private Text Generation via Knowledge Distillation »
Zhiliang Tian · Yingxiu Zhao · Ziyue Huang · Yu-Xiang Wang · Nevin L. Zhang · He He -
2022 Poster: Differentially Private Linear Sketches: Efficient Implementations and Applications »
Fuheng Zhao · Dan Qiao · Rachel Redberg · Divyakant Agrawal · Amr El Abbadi · Yu-Xiang Wang -
2022 Poster: Optimal Dynamic Regret in LQR Control »
Dheeraj Baby · Yu-Xiang Wang -
2021 Workshop: Privacy in Machine Learning (PriML) 2021 »
Yu-Xiang Wang · Borja Balle · Giovanni Cherubin · Kamalika Chaudhuri · Antti Honkela · Jonathan Lebensold · Casey Meehan · Mi Jung Park · Adrian Weller · Yuqing Zhu -
2020 Workshop: Privacy Preserving Machine Learning - PriML and PPML Joint Edition »
Borja Balle · James Bell · AurĂ©lien Bellet · Kamalika Chaudhuri · Adria Gascon · Antti Honkela · Antti Koskela · Casey Meehan · Olga Ohrimenko · Mi Jung Park · Mariana Raykova · Mary Anne Smart · Yu-Xiang Wang · Adrian Weller