Workshop: Goal-Conditioned Reinforcement Learning

Does Hierarchical Reinforcement Learning Outperform Standard Reinforcement Learning in Goal-Oriented Environments?

Ziyan Luo · Yijie Zhang · Zhaoyue (Rebecca) Wang

Keywords: [ Goal-oriented ] [ GCRL ] [ Temporal Abstraction ] [ HRL ]


Hierarchical Reinforcement Learning (HRL) targets long-horizon decision-making problems by decomposing a task into a hierarchy of subtasks. Many HRL methods perform bottom-up temporal abstraction automatically while simultaneously learning a hierarchical policy. In this study, we assess the performance of standard RL and HRL in a customizable 2D Minecraft domain with varying difficulty levels. We observe that, without a priori knowledge, predefined subgoal structures, and well-shaped reward structures, HRL methods surprisingly do not outperform all standard RL methods in the 2D Minecraft domain. We also provide clues to elucidate the underlying reasons for this outcome, e.g., whether HRL methods that incorporate automatic temporal abstraction can discover bottom-up action abstractions that match the intrinsic top-down task decomposition, often referred to as "goal-directed behavior" in goal-oriented environments.
