620 Results
Type | Time | Title | Authors
Workshop |  | Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction | Jiachen Li · Shuo Cheng · Zhenyu Liao · Huayan Wang · William Yang Wang · Qinxun Bai
Workshop |  | Efficient Multi-Horizon Learning for Off-Policy Reinforcement Learning | Raja Farrukh Ali · Nasik Muhammad Nafi · Kevin Duong · William Hsu
Poster | Thu 14:00 | Action-modulated midbrain dopamine activity arises from distributed control policies | Jack Lindsey · Ashok Litwin-Kumar
Poster | Tue 9:00 | The Pitfalls of Regularization in Off-Policy TD Learning | Gaurav Manek · J. Zico Kolter
Poster | Thu 9:00 | Markovian Interference in Experiments | Vivek Farias · Andrew Li · Tianyi Peng · Andrew Zheng
Workshop |  | AsymQ: Asymmetric Q-loss to mitigate overestimation bias in off-policy reinforcement learning | Qinsheng Zhang · Arjun Krishna · Sehoon Ha · Yongxin Chen
Workshop |  | Variance Reduction in Off-Policy Deep Reinforcement Learning using Spectral Normalization | Payal Bawa · Rafael Oliveira · Fabio Ramos
Poster | Thu 14:00 | The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning | Yunhao Tang · Remi Munos · Mark Rowland · Bernardo Avila Pires · Will Dabney · Marc Bellemare
Poster | Wed 9:00 | Policy Gradient With Serial Markov Chain Reasoning | Edoardo Cetin · Oya Celiktutan
Poster | Wed 9:00 | Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification | Takumi Tanabe · Rei Sato · Kazuto Fukuchi · Jun Sakuma · Youhei Akimoto
Poster | Wed 9:00 | Off-Policy Evaluation for Action-Dependent Non-stationary Environments | Yash Chandak · Shiv Shankar · Nathaniel Bastian · Bruno da Silva · Emma Brunskill · Philip Thomas
Poster | Thu 9:00 | MoCoDA: Model-based Counterfactual Data Augmentation | Silviu Pitis · Elliot Creager · Ajay Mandlekar · Animesh Garg