(Track3) Offline Reinforcement Learning: From Algorithm Design to Practical Applications

Sergey Levine · Aviral Kumar


Reinforcement learning (RL) provides a mathematical formalism for learning-based control that allows for acquisition of near-optimal behaviors by optimizing user-specified reward functions. While RL methods have received considerable attention recently due to impressive applications in many areas, the fact that RL requires a fundamentally online learning paradigm is one of the biggest obstacles to its widespread adoption. Online interaction is often impractical, because data collection is expensive (e.g., in robotics, or educational agents) or dangerous (e.g., in autonomous driving, or healthcare). An alternate approach is to utilize RL algorithms that effectively leverage previously collected experience without requiring online interaction. This has been referred to as batch RL, offline RL, or data-driven RL. Such algorithms hold tremendous promise for making it possible to turn datasets into powerful decision-making engines, similarly to how datasets have proven key to the success of supervised learning in vision and NLP. In this tutorial, we aim to provide the audience with the conceptual tools needed to both utilize offline RL as a tool, and to conduct research in this exciting area. We aim to provide an understanding of the challenges in offline RL, particularly in the context of modern deep RL methods, and describe some potential solutions that have been explored in recent work, along with applications. We will present classic and recent methods in a way that is accessible for practitioners, and also discuss the theoretical foundations for conducting research in this field. We will conclude with a discussion of open problems.

Chat is not available.