
Workshop: Goal-Conditioned Reinforcement Learning

Using Proto-Value Functions for Curriculum Generation in Goal-Conditioned RL

Henrik Metternich · Ahmed Hendway · Pascal Klink · Jan Peters · Carlo D'Eramo

Keywords: [ Curriculum Learning ] [ Reinforcement Learning ] [ Graph Laplacian ]


In this paper, we investigate the use of Proto-Value Functions (PVFs) for measuring the similarity between tasks in the context of Curriculum Learning (CL). PVFs provide a mathematical framework for generating basis functions over the state space of a Markov Decision Process (MDP). They capture the structure of the state-space manifold and have been shown to be useful for value function approximation in Reinforcement Learning (RL). We show that even a small number of PVFs suffices to estimate the similarity between tasks. Based on this observation, we introduce a new algorithm, Curriculum Representation Policy Iteration (CRPI), that uses PVFs for CL, and we provide a proof of concept in a Goal-Conditioned Reinforcement Learning (GCRL) setting.
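The abstract does not spell out how PVFs are computed or how they yield a task-similarity measure, so the following is only an illustrative sketch, not the paper's CRPI algorithm. It follows the standard PVF construction (eigenvectors of the graph Laplacian of the state-transition graph, here a 4-connected gridworld assumed for illustration) and uses a hypothetical similarity score: the cosine between two sparse goal-reward vectors after projecting each onto the span of the first k PVFs.

```python
import numpy as np

def grid_laplacian(n):
    """Combinatorial graph Laplacian L = D - A for an n x n gridworld
    with 4-connected states (assumed dynamics, for illustration only)."""
    size = n * n
    A = np.zeros((size, size))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((1, 0), (0, 1)):  # link right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    j = rr * n + cc
                    A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def proto_value_functions(L, k):
    """PVFs = Laplacian eigenvectors with the k smallest eigenvalues
    (the smoothest functions on the state-space graph)."""
    _, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vecs[:, :k]           # columns are the first k PVFs

def task_similarity(pvfs, reward_a, reward_b):
    """Hypothetical similarity measure (not from the paper): cosine of the
    two reward vectors after projecting each onto the span of the PVFs."""
    pa = pvfs @ (pvfs.T @ reward_a)
    pb = pvfs @ (pvfs.T @ reward_b)
    return float(pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb) + 1e-12))

def goal_reward(goal, size):
    """Sparse goal-conditioned reward: 1 at the goal state, 0 elsewhere."""
    r = np.zeros(size)
    r[goal] = 1.0
    return r

n = 5
pvfs = proto_value_functions(grid_laplacian(n), k=4)

near = task_similarity(pvfs, goal_reward(0, n * n), goal_reward(1, n * n))
far = task_similarity(pvfs, goal_reward(0, n * n), goal_reward(n * n - 1, n * n))
print(near, far)  # nearby goals score higher than opposite-corner goals
```

Even with only 4 of the 25 available basis functions, adjacent goals come out far more similar than goals in opposite corners, which is the kind of signal a curriculum generator could exploit to order tasks.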
