NeurIPS ZipIt!: Multitask Model Merging without Training

Spotlight in Workshop: UniReps: Unifying Representations in Neural Models

ZipIt!: Multitask Model Merging without Training

George Stoica · Daniel Bolya · Jakob Bjorner · Pratik Ramesh · Taylor Hearn · Judy Hoffman


Abstract: We tackle the extremely difficult problem of combining distinct models with different initializations, each solving a separate task, into one multi-task model without any additional training. Prior work in model merging permutes one model to the space of the other then averages them together. While this works for models trained on the same task, we find that this fails to account for the differences in models trained on disjoint tasks. Thus, we introduce "ZipIt!", a general method for merging two arbitrary models of the same architecture that incorporates two simple strategies. First, in order to account for features that aren't shared between models, we expand the model merging problem to allow for merging features within each model by defining a general "zip" operation. Second, we add support for partially zipping the models up until a specified layer, naturally creating a multi-head model. We find that these two changes combined account for a staggering 20-50% improvement over prior work,
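To make the idea of a "zip" operation concrete, here is a minimal sketch of one way to merge features both across and within models at a single layer. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how features are matched, so the correlation-based greedy pairing, the function name `greedy_zip_matrix`, and the inputs `feats_a` and `feats_b` (activations of each model on the same inputs, with no constant columns) are all hypothetical.

```python
import numpy as np


def greedy_zip_matrix(feats_a, feats_b):
    """Build a (d, 2d) merge matrix that averages greedily matched feature pairs.

    feats_a, feats_b: arrays of shape (n_samples, d), one per model.
    Assumption: pairs are chosen by highest Pearson correlation, and a pair
    may come from the same model or from different models.
    """
    feats = np.concatenate([feats_a, feats_b], axis=1)  # (n_samples, 2d)
    total = feats.shape[1]
    corr = np.corrcoef(feats, rowvar=False)  # (2d, 2d) feature correlations
    np.fill_diagonal(corr, -np.inf)  # a feature cannot pair with itself

    merge = np.zeros((total // 2, total))
    for row in range(total // 2):
        # Pick the most correlated pair among the features not yet matched.
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        merge[row, i] = 0.5
        merge[row, j] = 0.5
        # Remove both features from further consideration.
        corr[[i, j], :] = -np.inf
        corr[:, [i, j]] = -np.inf
    return merge
```

Applying the result as `merged = np.concatenate([feats_a, feats_b], axis=1) @ greedy_zip_matrix(feats_a, feats_b).T` yields `d` merged features from the `2d` originals. Under the same assumptions, partial zipping would amount to applying such a merge only to layers up to a chosen depth and leaving the remaining layers of each model as separate heads.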
