Poster in Workshop: Deep Reinforcement Learning
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay
Iryna Korshunova · Minqi Jiang · Jack Parker-Holder · Tim Rocktäschel · Edward Grefenstette
Abstract:
Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels. PLR selectively samples training levels, weighted by a function of recent temporal-difference (TD) errors experienced on each level. We explore the dispersion of returns as an alternative prioritization criterion to address certain issues with TD-error-based scores.
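To make the mechanism the abstract describes concrete, here is a minimal sketch of PLR-style level sampling contrasting the two scoring functions: a TD-error-based score and the proposed return-dispersion score. The helper names, the toy per-level return buffers, and the choice of standard deviation as the dispersion measure are illustrative assumptions, not the authors' exact implementation; the rank-based sampling distribution follows the form used in the original PLR work.

```python
import numpy as np
from collections import deque


def td_error_score(td_errors):
    """PLR-style score: mean absolute TD error from the most recent
    episode on a level, a proxy for its learning potential."""
    return float(np.mean(np.abs(td_errors)))


def return_dispersion_score(returns):
    """Alternative score sketched here: dispersion (standard deviation)
    of the returns recently achieved on a level."""
    return float(np.std(returns)) if len(returns) > 1 else 0.0


def rank_prioritized_probs(scores, beta=0.1):
    """Rank-based prioritization as in PLR: P(i) ∝ (1 / rank_i)^(1/beta),
    where rank 1 is the highest-scoring level."""
    scores = np.asarray(scores, dtype=np.float64)
    ranks = np.empty_like(scores)
    ranks[np.argsort(-scores)] = np.arange(1, len(scores) + 1)
    weights = (1.0 / ranks) ** (1.0 / beta)
    return weights / weights.sum()


# Toy usage (hypothetical data): per-level buffers of recent episodic returns.
rng = np.random.default_rng(0)
recent_returns = {
    0: deque([1.0, 1.1, 0.9], maxlen=10),  # low dispersion: likely mastered
    1: deque([0.0, 5.0, 2.5], maxlen=10),  # high dispersion: still learnable
    2: deque([0.0, 0.0, 0.0], maxlen=10),  # zero dispersion: possibly too hard
}
scores = [return_dispersion_score(list(r)) for r in recent_returns.values()]
probs = rank_prioritized_probs(scores, beta=0.1)
next_level = rng.choice(list(recent_returns.keys()), p=probs)
print(f"scores={scores}, probs={np.round(probs, 3)}, sampled level={next_level}")
```

In this sketch, the level with the widest spread of recent returns receives the highest sampling probability, which captures the intuition behind the dispersion criterion: levels with highly variable outcomes likely still offer learning signal, while levels with consistent returns are either mastered or uninformative.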