
Workshop: Goal-Conditioned Reinforcement Learning

Bi-Directional Goal-Conditioning on Single Value Function for State Space Search Problems

Vihaan Akshaay Rajendiran · Yu-Xiang Wang · Lei Li

Keywords: [ goal-conditioning ] [ Search Algorithms ] [ Deep Reinforcement Learning ] [ Search Problems ]

Abstract: State space search problems have a binary (found/not found) reward structure. In our work, we assume the ability to sample goal states and use them to define a forward task $(\tau^*)$ and a backward task $(\tau^{inv})$ derived from the original state space search task, yielding more useful and learnable samples. Analogous to Hindsight Relabelling, we define 'Foresight Relabelling' for reverse trajectories. We also use the agent's policy function to evaluate the reachability of intermediate states and use those states as goals for new sub-tasks. We combine these tasks and sample-generation strategies into a single goal-conditioned policy function (a DQN) that learns all of them, which we call 'SRE-DQN' (Scrambler-Resolver-Explorer). Finally, we demonstrate the advantages of bi-directional goal-conditioning and knowledge of the goal state by evaluating our framework on classical goal-reaching tasks and comparing against existing methods extended to our bi-directional setting.
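The abstract contrasts Hindsight Relabelling with the proposed Foresight Relabelling for reverse trajectories. The sketch below illustrates one plausible reading of that contrast on toy trajectories stored as `(state, action, next_state, goal)` tuples; the function names, the final-state relabelling strategy, and the assumption that backward ("scrambling") actions can be reversed are illustrative choices, not details confirmed by the paper.

```python
def hindsight_relabel(trajectory):
    """Hindsight relabelling (final-state strategy): treat the last
    state actually reached as if it had been the goal, turning a
    failed rollout into a successful goal-reaching sample."""
    achieved = trajectory[-1][2]          # final state of the rollout
    out = []
    for s, a, s_next, _g in trajectory:
        reward = 1.0 if s_next == achieved else 0.0
        out.append((s, a, s_next, achieved, reward))
    return out


def foresight_relabel(trajectory):
    """One plausible reading of 'Foresight Relabelling' for reverse
    ("scrambling") trajectories that begin at a sampled goal: read the
    transitions backwards so each becomes a step toward that goal, and
    relabel with the start state as the goal. Assumes each action has
    an inverse (true for permutation puzzles such as the Rubik's cube);
    here the original action label is kept as a stand-in for its inverse."""
    goal = trajectory[0][0]               # trajectory begins at the goal
    out = []
    for s, a, s_next, _g in reversed(trajectory):
        reward = 1.0 if s == goal else 0.0
        out.append((s_next, a, s, goal, reward))
    return out


# Toy forward rollout 0 -> 1 -> 2 -> 3 with unassigned goals.
traj = [(0, 'a', 1, None), (1, 'b', 2, None), (2, 'c', 3, None)]
print(hindsight_relabel(traj))   # every transition relabelled with goal 3
print(foresight_relabel(traj))   # reversed transitions relabelled with goal 0
```

Both relabelling schemes feed the same goal-conditioned value function, which is what allows a single DQN to serve the forward, backward, and exploration tasks.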
