Poster
Combining Behaviors with the Successor Features Keyboard
Wilka Carvalho · Andre Saraiva · Angelos Filos · Andrew Lampinen · Loic Matthey · Richard L Lewis · Honglak Lee · Satinder Singh · Danilo Jimenez Rezende · Daniel Zoran

Wed Dec 13 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #1913

The Option Keyboard (OK) was recently proposed as a method for transferring behavioral knowledge across tasks. OK transfers knowledge by adaptively combining subsets of known behaviors using Successor Features (SFs) and Generalized Policy Improvement (GPI). However, it relies on hand-designed state-features and task encodings, which are cumbersome to design for every new environment. In this work, we propose the "Successor Features Keyboard" (SFK), which enables transfer with discovered state-features and task encodings. To enable discovery, we propose the "Categorical Successor Feature Approximator" (CSFA), a novel learning algorithm for estimating SFs while jointly discovering state-features and task encodings. With SFK and CSFA, we achieve the first demonstration of transfer with SFs in a challenging 3D environment where all the necessary representations are discovered. We first compare CSFA against other methods for approximating SFs and show that only CSFA discovers representations compatible with SF&GPI at this scale. We then compare SFK against transfer learning baselines and show that it transfers most quickly to long-horizon tasks.
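For readers unfamiliar with SF&GPI, the following is a minimal sketch of the standard action-selection rule the abstract builds on, not of SFK or CSFA themselves. It assumes the usual formulation in which the value of policy i on a task with encoding w is the dot product of its successor features with w, and GPI picks the action with the highest value across all known policies. The function names, the toy numbers, and the two-dimensional features are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def q_values(psi: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Q(s, a) = psi(s, a) . w for every action a in the current state."""
    return [sum(p * wi for p, wi in zip(psi_a, w)) for psi_a in psi]


def gpi_action(
    psis: Sequence[Sequence[Sequence[float]]], w: Sequence[float]
) -> Tuple[int, float]:
    """Generalized Policy Improvement over a set of known policies.

    psis[i][a] is the (assumed given) successor-feature vector of policy i
    for action a in the current state. Returns the action maximizing
    max_i psi_i(s, a) . w, together with that value.
    """
    best_action, best_value = 0, float("-inf")
    for psi in psis:
        for a, q in enumerate(q_values(psi, w)):
            if q > best_value:
                best_action, best_value = a, q
    return best_action, best_value


# Toy example (hypothetical numbers): two known policies, two actions,
# two state-feature dimensions.
psis = [
    [[1.0, 0.0], [0.2, 0.1]],  # policy 0 mostly accumulates feature 0
    [[0.0, 0.1], [0.0, 1.0]],  # policy 1 mostly accumulates feature 1
]
w_new = [0.5, 1.0]  # encoding of a new task that values feature 1 more
action, value = gpi_action(psis, w_new)
```

In this toy setting the new task weights feature 1 more heavily, so GPI selects the action favored by policy 1. In the work described above, both the state-features and the task encodings are discovered rather than hand-designed as they are here.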

Author Information

Wilka Carvalho
Andre Saraiva (DeepMind)
Angelos Filos (DeepMind)
Andrew Lampinen (Google DeepMind)
Loic Matthey (DeepMind)
Richard L Lewis (University of Michigan)
Honglak Lee (LG AI Research / U. Michigan)
Satinder Singh (DeepMind)
Danilo Jimenez Rezende (Google DeepMind)
Daniel Zoran (DeepMind)

More from the Same Authors