
Temporary Goals for Exploration
Haoyang Xu · Jimmy Ba · Silviu Pitis · Harris Chan
Event URL: https://openreview.net/forum?id=viSa_CFEgYQ

Exploration has always been a crucial aspect of reinforcement learning. When facing long-horizon, sparse-reward environments, modern methods still struggle to explore effectively and generalize poorly. In the multi-goal reinforcement learning setting, out-of-distribution goals might appear similar to achieved ones, but the agent can never accurately assess its ability to reach them without attempting them. To enable faster exploration and improve generalization, we propose an exploration method that lets the agent temporarily pursue the most meaningful nearby goal. We demonstrate the performance of our method through experiments in four multi-goal environments, including continuous navigation tasks (a 2D PointMaze and an AntMaze) and a discrete multi-goal foraging world.
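The abstract's core idea, choosing a nearby goal worth attempting before resuming the original task, can be illustrated with a minimal sketch. This is not the authors' algorithm; the `novelty` scoring function and the Euclidean `radius` cutoff are hypothetical placeholders for whatever "meaningfulness" measure and proximity notion the paper actually uses.

```python
import math

def select_temporary_goal(achieved_goals, candidate_goals, novelty, radius=1.0):
    """Pick the highest-scoring candidate goal lying near an achieved goal.

    achieved_goals: goals the agent has already reached (tuples of floats).
    candidate_goals: goals the agent could temporarily pursue.
    novelty: hypothetical scoring function standing in for a real
             "meaningfulness" measure (e.g. novelty or value uncertainty).
    radius: hypothetical Euclidean cutoff defining "nearby".
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Keep only candidates within `radius` of some already-achieved goal.
    nearby = [g for g in candidate_goals
              if any(dist(g, a) <= radius for a in achieved_goals)]
    if not nearby:
        return None  # no suitable temporary goal; pursue the original goal
    return max(nearby, key=novelty)

# Usage: with a toy novelty score, the agent picks the nearby goal (0.5, 0.5)
# over (0.9, 0.0), while the distant goal (3.0, 3.0) is filtered out.
goal = select_temporary_goal(
    achieved_goals=[(0.0, 0.0)],
    candidate_goals=[(0.5, 0.5), (3.0, 3.0), (0.9, 0.0)],
    novelty=lambda g: g[0] + g[1],
)
```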

Author Information

Haoyang Xu (University of Toronto)
Jimmy Ba (University of Toronto / Vector Institute)
Silviu Pitis (University of Toronto)
Harris Chan (University of Toronto, Vector Institute)
