156 Results
Poster
|
Thu 14:00 |
Off-Policy Evaluation with Policy-Dependent Optimization Response Wenshuo Guo · Michael Jordan · Angela Zhou |
|
Poster
|
Wed 9:00 |
Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification Takumi Tanabe · Rei Sato · Kazuto Fukuchi · Jun Sakuma · Youhei Akimoto |
|
Poster
|
Wed 14:00 |
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL Benjamin Eysenbach · Alexander Khazatsky · Sergey Levine · Russ Salakhutdinov |
|
Poster
|
A Policy-Guided Imitation Approach for Offline Reinforcement Learning Haoran Xu · Li Jiang · Li Jianxiong · Xianyuan Zhan |
|
Poster
|
Thu 9:00 |
Markovian Interference in Experiments Vivek Farias · Andrew Li · Tianyi Peng · Andrew Zheng |
|
Poster
|
Wed 9:00 |
Policy Gradient With Serial Markov Chain Reasoning Edoardo Cetin · Oya Celiktutan |
|
Poster
|
Wed 9:00 |
Off-Policy Evaluation for Action-Dependent Non-stationary Environments Yash Chandak · Shiv Shankar · Nathaniel Bastian · Bruno da Silva · Emma Brunskill · Philip Thomas |
|
Poster
|
Thu 9:00 |
MoCoDA: Model-based Counterfactual Data Augmentation Silviu Pitis · Elliot Creager · Ajay Mandlekar · Animesh Garg |
|
Poster
|
Thu 9:00 |
A Unifying Framework of Off-Policy General Value Function Evaluation Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin Liang |
|
Poster
|
Wed 9:00 |
Off-Policy Evaluation with Deficient Support Using Side Information Nicolò Felicioni · Maurizio Ferrari Dacrema · Marcello Restelli · Paolo Cremonesi |
|
Poster
|
Thu 14:00 |
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning Yunhao Tang · Remi Munos · Mark Rowland · Bernardo Avila Pires · Will Dabney · Marc Bellemare |
|
Poster
|
Thu 14:00 |
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement Erik Wijmans · Irfan Essa · Dhruv Batra |