Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction

Jiachen Li ⋅ Shuo Cheng ⋅ Zhenyu Liao ⋅ Huayan Wang ⋅ William Yang Wang ⋅ Qinxun Bai

Abstract
