Poster in Workshop: Deep Reinforcement Learning
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay
Iryna Korshunova · Minqi Jiang · Jack Parker-Holder · Tim Rocktäschel · Edward Grefenstette
Abstract:
Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels. PLR selectively samples training levels, weighted by a function of recent temporal-difference (TD) errors experienced on each level. We explore the dispersion of returns as an alternative prioritization criterion to address certain issues with TD-error-based scores.
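To make the mechanism the abstract describes concrete, here is a minimal sketch of PLR-style level sampling contrasting the two scoring functions: a TD-error-based score and the proposed return-dispersion score. The helper names, the toy per-level return buffers, and the choice of standard deviation as the dispersion measure are illustrative assumptions, not the authors' exact implementation; the rank-based sampling distribution follows the form used in the original PLR work.

```python
import numpy as np
from collections import deque


def td_error_score(td_errors):
    """PLR-style score: mean absolute TD error from the most recent
    episode on a level, a proxy for its learning potential."""
    return float(np.mean(np.abs(td_errors)))


def return_dispersion_score(returns):
    """Alternative score sketched here: dispersion (standard deviation)
    of the returns recently achieved on a level."""
    return float(np.std(returns)) if len(returns) > 1 else 0.0


def rank_prioritized_probs(scores, beta=0.1):
    """Rank-based prioritization as in PLR: P(i) ∝ (1 / rank_i)^(1/beta),
    where rank 1 is the highest-scoring level."""
    scores = np.asarray(scores, dtype=np.float64)
    ranks = np.empty_like(scores)
    ranks[np.argsort(-scores)] = np.arange(1, len(scores) + 1)
    weights = (1.0 / ranks) ** (1.0 / beta)
    return weights / weights.sum()


# Toy usage (hypothetical data): per-level buffers of recent episodic returns.
rng = np.random.default_rng(0)
recent_returns = {
    0: deque([1.0, 1.1, 0.9], maxlen=10),  # low dispersion: likely mastered
    1: deque([0.0, 5.0, 2.5], maxlen=10),  # high dispersion: still learnable
    2: deque([0.0, 0.0, 0.0], maxlen=10),  # zero dispersion: possibly too hard
}
scores = [return_dispersion_score(list(r)) for r in recent_returns.values()]
probs = rank_prioritized_probs(scores, beta=0.1)
next_level = rng.choice(list(recent_returns.keys()), p=probs)
print(f"scores={scores}, probs={np.round(probs, 3)}, sampled level={next_level}")
```

In this sketch, the level with the widest spread of recent returns receives the highest sampling probability, which captures the intuition behind the dispersion criterion: levels with highly variable outcomes likely still offer learning signal, while levels with consistent returns are either mastered or uninformative.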