Learning Successor Feature Representations to Train Robust Policies for Multi-task Learning
Melissa Mozifian · Dieter Fox · David Meger · Fabio Ramos · Animesh Garg
Event URL: https://openreview.net/forum?id=iYJFH9D-oKl

The deep reinforcement learning (RL) framework has shown great promise for tackling sequential decision-making problems, in which an agent learns to behave optimally by interacting with the environment and receiving rewards. The ability of an RL agent to learn different reward functions concurrently has many benefits, such as decomposing task rewards and promoting skill reuse. In this paper, we consider continuous control for robot manipulation tasks, using an explicit representation that promotes skill reuse while learning multiple tasks with similar reward functions. Our approach relies on two key concepts: successor features (SFs), a value function representation that decouples the dynamics of the environment from the rewards, and an actor-critic framework that incorporates the learned SF representation. SFs form a natural bridge between model-based and model-free RL methods. We first show how to learn the decomposable representation required by SFs in a pre-training stage. The proposed architecture is able to learn decoupled state and reward feature representations for non-linear reward functions. We then evaluate the feasibility of integrating SFs into an actor-critic framework, which is better suited to tasks solved with deep RL algorithms. The approach is tested empirically on non-trivial continuous control problems with compositional structure built into the tasks' reward functions.
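To make the decoupling of dynamics from rewards concrete, the following is a minimal tabular sketch, not the architecture proposed in the paper. It assumes a fixed policy, a hypothetical three-state chain, hand-written state features `PHI`, and rewards that are linear in those features, r(s) = phi(s) · w. The paper instead learns the features with deep networks and handles non-linear rewards. All names here (`P`, `PHI`, `successor_features`, `values_from_sf`, `values_direct`) are illustrative assumptions.

```python
GAMMA = 0.9

# Hypothetical 3-state chain under a fixed policy: P[s][s2] = Pr(s2 | s).
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],  # state 2 is absorbing
]

# Hypothetical 2-dimensional state features phi(s); assumed, not learned.
PHI = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]


def successor_features(P, phi, gamma, iters=500):
    """Iterate psi(s) = phi(s) + gamma * sum_s2 P[s][s2] * psi(s2).

    psi depends only on the dynamics and the features, not on any reward.
    """
    n, d = len(phi), len(phi[0])
    psi = [[0.0] * d for _ in range(n)]
    for _ in range(iters):
        psi = [
            [
                phi[s][k] + gamma * sum(P[s][s2] * psi[s2][k] for s2 in range(n))
                for k in range(d)
            ]
            for s in range(n)
        ]
    return psi


def values_from_sf(psi, w):
    """Value of each state for task weights w: V(s) = psi(s) . w."""
    return [sum(p * wk for p, wk in zip(row, w)) for row in psi]


def values_direct(P, phi, w, gamma, iters=500):
    """Reference: ordinary policy evaluation with reward r(s) = phi(s) . w."""
    n = len(phi)
    r = [sum(f * wk for f, wk in zip(phi[s], w)) for s in range(n)]
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][s2] * v[s2] for s2 in range(n)) for s in range(n)]
    return v


# Compute the successor features once, then reuse them for two tasks.
PSI = successor_features(P, PHI, GAMMA)
V_TASK_A = values_from_sf(PSI, [1.0, 2.0])
V_TASK_B = values_from_sf(PSI, [0.0, -1.0])
```

In this sketch `PSI` is computed a single time, and switching tasks only changes the weight vector passed to `values_from_sf`. Under the stated linear-reward assumption, the result agrees with running policy evaluation from scratch for each task, which is the sense in which the representation supports reuse across tasks with related reward functions.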

Author Information

Melissa Mozifian (Mila)
Dieter Fox (University of Washington)
David Meger (McGill University)
Fabio Ramos (University of Sydney, NVIDIA)
Animesh Garg (University of Toronto, NVIDIA, Vector Institute)

I am a CIFAR AI Chair Assistant Professor of Computer Science at the University of Toronto, a Faculty Member at the Vector Institute, and a Senior Researcher at NVIDIA. My current research focuses on machine learning for perception and control in robotics.

More from the Same Authors