Task Matrices: Linear Maps for Cross-Model Finetuning Transfer across Modalities
Abstract
Results in interpretability suggest that large vision and language models develop implicit linearities during pretraining. Learned linear encodings have been documented in in-context learning settings, where model predictions are biased at runtime. However, it is unclear whether similar linear representations exist in more generalized adaptation regimes. In this work, we develop the concept of a task matrix, a linear transformation from a base to a finetuned embedding state. We demonstrate that for CLIP, DeiT, allMiniLM-V2, and RoBERTa, a base model augmented with a task matrix approaches finetuned accuracy on certain datasets, while yielding only marginal improvements on others. Our results demonstrate that across a range of models, modalities, and tasks, linear encodings in transformer embedding spaces exist not only between layers within a single model architecture, but also between pretrained and finetuned architectures.
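To make the task-matrix idea concrete, the following is a minimal sketch of fitting a linear map from base-model embeddings to finetuned-model embeddings. The abstract does not specify the fitting procedure; ordinary least squares on paired embeddings is assumed here, and the embeddings themselves are synthetic stand-ins rather than outputs of any of the models named above.

```python
import numpy as np

# Hypothetical setup: paired embeddings of the same inputs from a base model
# and from its finetuned counterpart. Real embeddings would come from, e.g.,
# a frozen encoder before and after finetuning; here they are synthetic.
rng = np.random.default_rng(0)
n, d = 1000, 64                       # number of examples, embedding dimension

base = rng.normal(size=(n, d))        # base-model embeddings (stand-in)
W_true = rng.normal(size=(d, d))      # an assumed underlying linear relation
finetuned = base @ W_true + 0.01 * rng.normal(size=(n, d))  # noisy targets

# Fit the task matrix W by least squares: minimize ||base @ W - finetuned||_F.
W, *_ = np.linalg.lstsq(base, finetuned, rcond=None)

# "Base model augmented with a task matrix": map fresh base embeddings
# toward the finetuned embedding space with a single matrix multiply.
new_base = rng.normal(size=(5, d))
approx_finetuned = new_base @ W

# Relative reconstruction error on the fitting set.
err = np.linalg.norm(base @ W - finetuned) / np.linalg.norm(finetuned)
print(f"relative reconstruction error: {err:.4f}")
```

In this synthetic setting the map is linear by construction, so the fit is nearly exact; the paper's claim is that a comparable linear fit between real pretrained and finetuned embedding spaces recovers much of the finetuned accuracy on some tasks.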