In Defense of the Unitary Scalarization for Deep Multi-Task Learning
Vitaly Kurin · Alessandro De Palma · Ilya Kostrikov · Shimon Whiteson · Pawan K Mudigonda

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #306

Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad-hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients, and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
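As a rough illustration of what the abstract means by unitary scalarization, the sketch below runs gradient descent on the unweighted sum of two task losses that share a single parameter. It is not the authors' code: the toy regression tasks, the finite-difference gradient, and the names `TASKS`, `total_loss`, `grad_total`, and `train` are all assumptions made for this example.

```python
# Toy sketch of unitary scalarization (assumed setup, not the paper's code):
# two regression tasks share one scalar weight w, and training minimizes
# the plain, unweighted sum of the per-task losses.

# Hypothetical data: one (input, target) pair per task.
TASKS = [(1.0, 2.0), (1.0, 4.0)]


def total_loss(w, tasks):
    """Unweighted sum of the per-task squared errors."""
    return sum((w * x - y) ** 2 for x, y in tasks)


def grad_total(w, tasks, eps=1e-6):
    """Central finite-difference gradient of the summed loss.

    Only the scalar sum is differentiated; no per-task gradients are needed.
    """
    return (total_loss(w + eps, tasks) - total_loss(w - eps, tasks)) / (2 * eps)


def train(w, tasks, steps=100, lr=0.1):
    """Plain gradient descent on the summed loss."""
    for _ in range(steps):
        w = w - lr * grad_total(w, tasks)
    return w


if __name__ == "__main__":
    w_final = train(0.0, TASKS)
    print(w_final, total_loss(w_final, TASKS))
```

With these two tasks the summed loss is (w - 2)^2 + (w - 4)^2, so the shared weight settles at the compromise value w = 3. The specialized optimizers the abstract refers to would instead compute and combine a separate gradient for each task, which is where the additional memory and runtime cost comes from.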

Author Information

Vitaly Kurin (University of Oxford)
Alessandro De Palma (University of Oxford)

PhD student in Autonomous Intelligent Machines and Systems at University of Oxford

Ilya Kostrikov (University of California Berkeley)
Shimon Whiteson (Oxford University)
Pawan K Mudigonda (University of Oxford)
