Uni[MASK]: Unified Inference in Sequential Decision Problems

Micah Carroll · Orr Paradise · Jessy Lin · Raluca Georgescu · Mingfei Sun · David Bignell · Stephanie Milani · Katja Hofmann · Matthew Hausknecht · Anca Dragan · Sam Devlin

Hall J #107

Keywords: [ Multi-task Learning ] [ Unsupervised Learning ] [ Reinforcement Learning ] [ Deep Learning ]

Wed 30 Nov 9 a.m. PST — 11 a.m. PST


Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models that can be trained on many different sequential decision-making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models.
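To make the "tasks as maskings" idea concrete, the sketch below builds visibility masks over a length-T trajectory of states, actions, and returns for the four tasks named in the abstract. It is an illustration only, not the authors' implementation: the function names, the `Mask` layout, and the exact choice of which tokens are visible for each task are assumptions made here, and the paper's actual masking schemes may differ.

```python
from typing import Dict, List

# Hypothetical representation: True = token is shown to the model,
# False = token is masked out and may be a prediction target.
Mask = Dict[str, List[bool]]


def empty_mask(T: int) -> Mask:
    """Start with every state, action, and return token hidden."""
    return {key: [False] * T for key in ("states", "actions", "returns")}


def behavior_cloning(T: int, t: int) -> Mask:
    """Show states up to t and actions before t; the action at t is hidden."""
    m = empty_mask(T)
    for i in range(t + 1):
        m["states"][i] = True
    for i in range(t):
        m["actions"][i] = True
    return m


def return_conditioned(T: int, t: int) -> Mask:
    """Behavior-cloning mask plus the first return token (assumed offline-RL setup)."""
    m = behavior_cloning(T, t)
    m["returns"][0] = True
    return m


def inverse_dynamics(T: int, t: int) -> Mask:
    """Show two consecutive states; the action between them stays hidden."""
    m = empty_mask(T)
    m["states"][t] = True
    m["states"][t + 1] = True
    return m


def waypoint_conditioning(T: int, t: int, goal: int) -> Mask:
    """Show the current state and one future waypoint state; actions stay hidden."""
    m = empty_mask(T)
    m["states"][t] = True
    m["states"][goal] = True
    return m


def render(m: Mask) -> str:
    """Draw a mask as rows of 'x' (visible) and '_' (masked)."""
    return "\n".join(
        f"{key:8s}" + "".join("x" if v else "_" for v in row)
        for key, row in m.items()
    )


print(render(behavior_cloning(T=4, t=2)))
```

Under these assumptions, every task is the same data structure with a different pattern of `True` entries, which is what would let one model be trained on all of them by sampling masks rather than by changing the architecture.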
