Offline Reinforcement Learning
Aviral Kumar · Rishabh Agarwal · George Tucker · Lihong Li · Doina Precup

Sat Dec 12 09:00 AM -- 06:00 PM (PST)
Event URL: https://offline-rl-neurips.github.io/

The common paradigm in reinforcement learning (RL) assumes that an agent frequently interacts with the environment and learns from its own collected experience. This mode of operation is prohibitive for many complex real-world problems, where repeatedly collecting diverse data is expensive (e.g., robotics or educational agents) and/or dangerous (e.g., healthcare). Offline RL instead focuses on training agents from logged data, with no further environment interaction. Offline RL promises to bring forward a data-driven RL paradigm and carries the potential to scale up end-to-end learning approaches to real-world decision-making tasks such as robotics, recommendation systems, dialogue generation, autonomous driving, healthcare systems, and safety-critical applications. Recently, successful deep RL algorithms have been adapted to the offline RL setting and have shown promise in a number of domains; however, significant algorithmic and practical challenges remain to be addressed. The goal of this workshop is to bring attention to offline RL from both within and outside the RL community, to discuss algorithmic challenges that need to be addressed, potential real-world applications, and limitations, and to come up with concrete problem statements and evaluation protocols, inspired by real-world applications, for the research community to work on.

For details on submission please visit: https://offline-rl-neurips.github.io/ (Submission deadline: October 9, 11:59 pm PT)

Emma Brunskill (Stanford)
Finale Doshi-Velez (Harvard)
John Langford (Microsoft Research)
Nan Jiang (UIUC)
Brandyn White (Waymo Research)
Nando de Freitas (DeepMind)

Sat 8:50 a.m. - 9:00 a.m.
Aviral Kumar, George Tucker, Rishabh Agarwal
Sat 9:00 a.m. - 9:30 a.m.
Offline RL (Talk)   
Nando de Freitas
Sat 9:30 a.m. - 9:40 a.m.
Q&A w/ Nando de Freitas (Q&A)
Sat 9:40 a.m. - 9:50 a.m.

Aayam Shrestha (Oregon State University)*; Stefan Lee (Oregon State University); Prasad Tadepalli (Oregon State University); Alan Fern (Oregon State University)

Aayam Shrestha
Sat 9:50 a.m. - 10:00 a.m.
Contributed Talk 2: Chaining Behaviors from Data with Model-Free Reinforcement Learning (Talk)   
Avi Singh
Sat 10:00 a.m. - 10:10 a.m.
Contributed Talk 3: Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets (Talk)   
Seunghyun Lee, Younggyo Seo, Kimin Lee
Sat 10:10 a.m. - 10:20 a.m.
Contributed Talk 4: Addressing Extrapolation Error in Deep Offline Reinforcement Learning (Talk)
Caglar Gulcehre
Sat 10:20 a.m. - 10:30 a.m.
Q/A for Contributed Talks 1 (Q&A)
Sat 10:30 a.m. - 11:20 a.m.
Poster Session 1 (gather.town) (Poster Session)
Sat 11:20 a.m. - 11:50 a.m.
Causal Structure Discovery in RL (Talk)
John Langford
Sat 11:50 a.m. - 12:00 p.m.
Q&A w/ John Langford (Q&A)
Sat 12:00 p.m. - 1:00 p.m.
Emma Brunskill, Nan Jiang, Nando de Freitas, Finale Doshi-Velez, Sergey Levine, John Langford, Lihong Li, George Tucker, Rishabh Agarwal, Aviral Kumar
Sat 1:10 p.m. - 1:40 p.m.
Learning a Multi-Agent Simulator from Offline Demonstrations (Talk)   
Brandyn White
Sat 1:40 p.m. - 1:50 p.m.
Q&A w/ Brandyn White (Q&A)
Sat 1:50 p.m. - 2:20 p.m.
Towards Reliable Validation and Evaluation for Offline RL (Talk)   
Nan Jiang
Sat 2:20 p.m. - 2:30 p.m.
Q&A w/ Nan Jiang (Q&A)
Sat 2:30 p.m. - 2:40 p.m.
Contributed Talk 5: Latent Action Space for Offline Reinforcement Learning (Talk)   
Wenxuan Zhou
Sat 2:40 p.m. - 2:50 p.m.
Contributed Talk 6: What are the Statistical Limits for Batch RL with Linear Function Approximation? (Talk)   
Ruosong Wang
Sat 2:50 p.m. - 3:00 p.m.
Contributed Talk 7: Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning (Talk)   
Sam Daulton, Hong Namkoong
Sat 3:00 p.m. - 3:10 p.m.
Contributed Talk 8: Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation (Talk)   
Diksha Garg
Sat 3:10 p.m. - 3:20 p.m.
Q/A for Contributed Talks 2 (Q&A)
Sat 3:20 p.m. - 4:30 p.m.
Poster Session 2 (gather.town) (Poster Session)
Sat 4:30 p.m. - 5:00 p.m.
Counterfactuals and Offline RL (Talk)
Emma Brunskill
Sat 5:00 p.m. - 5:10 p.m.
Q&A w/ Emma Brunskill (Q&A)
Sat 5:10 p.m. - 5:40 p.m.
Batch RL Models Built for Validation (Talk)   
Finale Doshi-Velez
Sat 5:40 p.m. - 5:50 p.m.
Q&A w/ Finale Doshi-Velez (Q&A)
Sat 5:50 p.m. - 6:00 p.m.
Closing Remarks

Author Information

Aviral Kumar (UC Berkeley)
Rishabh Agarwal (Google Research, Brain Team)

I am a researcher in the Google Brain team in Montréal. My research interests mainly revolve around Deep Reinforcement Learning (RL), often with the goal of making RL methods suitable for real-world problems.

George Tucker (Google Brain)
Lihong Li (Google Brain)
Doina Precup (McGill University / Mila / DeepMind Montreal)