

Poster in Workshop: 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models

PREMIER-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss

Ruijie Zheng · Yongyuan Liang · Xiyao Wang · Shuang Ma · Hal Daumé III · Huazhe Xu · John Langford · Praveen Palanisamy · Kalyan Basu · Furong Huang

Keywords: [ Contrastive Learning ] [ Reinforcement Learning ] [ Pretraining ] [ Representation ]


Abstract: We introduce Premier-TACO, a novel multitask feature representation learning method that aims to improve the efficiency of few-shot policy learning in sequential decision-making tasks. Premier-TACO pretrains a general feature representation using a small subset of relevant multitask offline datasets, capturing essential environmental dynamics; this representation can then be fine-tuned to specific tasks with a few expert demonstrations. Building on the recent temporal action contrastive learning (TACO) objective, which achieves state-of-the-art performance in visual control tasks, Premier-TACO additionally employs a simple yet effective negative example sampling strategy. This key modification ensures computational efficiency and scalability for large-scale multitask offline pretraining. Experimental results on both the DeepMind Control Suite and MetaWorld underscore the effectiveness of Premier-TACO for pretraining visual representations, enabling efficient few-shot imitation learning on unseen tasks. On the DeepMind Control Suite, Premier-TACO achieves an average improvement of 101% over a carefully implemented Learn-from-scratch baseline, and a 24% improvement over the most effective baseline pretraining method. Similarly, on MetaWorld, Premier-TACO obtains an average improvement of 74% over Learn-from-scratch and a 40% improvement over the best baseline pretraining method.
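To make the objective concrete, below is a minimal, illustrative sketch of a temporal action-driven contrastive loss with in-batch negative sampling, in the spirit of what the abstract describes. This is an assumption-laden reconstruction, not the authors' implementation: the function name `taco_style_loss`, the InfoNCE-style formulation, and the use of other batch elements as negatives are all illustrative choices; Premier-TACO's exact objective and its specific negative sampling strategy are defined in the paper.

```python
# Hypothetical sketch of a TACO-style temporal action contrastive
# objective. All names and shapes are illustrative assumptions,
# not the Premier-TACO reference implementation.
import torch
import torch.nn.functional as F

def taco_style_loss(state_emb, action_emb, next_state_emb, proj, temperature=0.1):
    """InfoNCE-style loss: each (state, action) pair should predict its
    true next-state embedding, while the other next states in the batch
    serve as negatives (a simple in-batch negative sampling scheme).

    state_emb, next_state_emb: (B, D) encoder outputs
    action_emb:                (B, A) action embeddings
    proj: module mapping concat(state, action) -> D
    """
    # Predict the next latent state from the current state and action.
    pred = proj(torch.cat([state_emb, action_emb], dim=-1))  # (B, D)
    pred = F.normalize(pred, dim=-1)
    target = F.normalize(next_state_emb, dim=-1)

    # Similarity of every prediction against every next state in the batch;
    # diagonal entries are the positives, off-diagonals the negatives.
    logits = pred @ target.t() / temperature                  # (B, B)
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)

# Example usage with random tensors and a linear projection head:
B, D, A = 32, 128, 16
proj = torch.nn.Linear(D + A, D)
loss = taco_style_loss(torch.randn(B, D), torch.randn(B, A),
                       torch.randn(B, D), proj)
loss.backward()
```

Sampling negatives from within the batch is one common way to keep such an objective cheap at scale, since no extra forward passes are needed, which is consistent with the abstract's emphasis on computational efficiency for large-scale multitask pretraining.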
