30 Results
Poster | Wed 9:00 | Policy Gradient With Serial Markov Chain Reasoning | Edoardo Cetin · Oya Celiktutan
Poster | Wed 9:00 | Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification | Takumi Tanabe · Rei Sato · Kazuto Fukuchi · Jun Sakuma · Youhei Akimoto
Workshop | | MOPA: a Minimalist Off-Policy Approach to Safe-RL | Hao Sun · Ziping Xu · Zhenghao Peng · Meng Fang · Bo Dai · Bolei Zhou
Poster | Tue 9:00 | The Pitfalls of Regularization in Off-Policy TD Learning | Gaurav Manek · J. Zico Kolter
Poster | Wed 9:00 | Off-Policy Evaluation for Action-Dependent Non-stationary Environments | Yash Chandak · Shiv Shankar · Nathaniel Bastian · Bruno da Silva · Emma Brunskill · Philip Thomas
Poster | Wed 9:00 | Off-Policy Evaluation with Deficient Support Using Side Information | Nicolò Felicioni · Maurizio Ferrari Dacrema · Marcello Restelli · Paolo Cremonesi
Poster | Thu 9:00 | MoCoDA: Model-based Counterfactual Data Augmentation | Silviu Pitis · Elliot Creager · Ajay Mandlekar · Animesh Garg
Poster | Thu 9:00 | Conformal Off-Policy Prediction in Contextual Bandits | Muhammad Faaiz Taufiq · Jean-Francois Ton · Rob Cornish · Yee Whye Teh · Arnaud Doucet
Poster | | On the role of overparameterization in off-policy Temporal Difference learning with linear function approximation | Valentin Thomas
Poster | Thu 9:00 | A Unifying Framework of Off-Policy General Value Function Evaluation | Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin Liang
Poster | Tue 14:00 | Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models | Rui Miao · Zhengling Qi · Xiaoke Zhang
Poster | Wed 14:00 | A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP | Fan Chen · Junyu Zhang · Zaiwen Wen