
Workshop: Goal-Conditioned Reinforcement Learning

Using Proto-Value Functions for Curriculum Generation in Goal-Conditioned RL

Henrik Metternich · Ahmed Hendway · Pascal Klink · Jan Peters · Carlo D'Eramo

Keywords: [ Curriculum Learning ] [ Reinforcement Learning ] [ Graph Laplacian ]


In this paper, we investigate the use of Proto-Value Functions (PVFs) for measuring the similarity between tasks in the context of Curriculum Learning (CL). PVFs provide a mathematical framework for generating basis functions over the state space of a Markov Decision Process (MDP). They capture the structure of the state-space manifold and have been shown to be useful for value function approximation in Reinforcement Learning (RL). We show that even a small number of PVFs suffices to estimate the similarity between tasks. Based on this observation, we introduce a new algorithm, Curriculum Representation Policy Iteration (CRPI), that uses PVFs for CL, and we provide a proof of concept in a Goal-Conditioned Reinforcement Learning (GCRL) setting.
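The abstract does not spell out how PVFs are computed or how they yield a task-similarity measure, so the following is only an illustrative sketch, not the paper's CRPI algorithm. It follows the standard PVF construction (eigenvectors of the graph Laplacian of the state-transition graph, here a 4-connected gridworld assumed for illustration) and uses a hypothetical similarity score: the cosine between two sparse goal-reward vectors after projecting each onto the span of the first k PVFs.

```python
import numpy as np

def grid_laplacian(n):
    """Combinatorial graph Laplacian L = D - A for an n x n gridworld
    with 4-connected states (assumed dynamics, for illustration only)."""
    size = n * n
    A = np.zeros((size, size))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((1, 0), (0, 1)):  # link right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    j = rr * n + cc
                    A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def proto_value_functions(L, k):
    """PVFs = Laplacian eigenvectors with the k smallest eigenvalues
    (the smoothest functions on the state-space graph)."""
    _, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vecs[:, :k]           # columns are the first k PVFs

def task_similarity(pvfs, reward_a, reward_b):
    """Hypothetical similarity measure (not from the paper): cosine of the
    two reward vectors after projecting each onto the span of the PVFs."""
    pa = pvfs @ (pvfs.T @ reward_a)
    pb = pvfs @ (pvfs.T @ reward_b)
    return float(pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb) + 1e-12))

def goal_reward(goal, size):
    """Sparse goal-conditioned reward: 1 at the goal state, 0 elsewhere."""
    r = np.zeros(size)
    r[goal] = 1.0
    return r

n = 5
pvfs = proto_value_functions(grid_laplacian(n), k=4)

near = task_similarity(pvfs, goal_reward(0, n * n), goal_reward(1, n * n))
far = task_similarity(pvfs, goal_reward(0, n * n), goal_reward(n * n - 1, n * n))
print(near, far)  # nearby goals score higher than opposite-corner goals
```

Even with only 4 of the 25 available basis functions, adjacent goals come out far more similar than goals in opposite corners, which is the kind of signal a curriculum generator could exploit to order tasks.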
