Workshop: INTERPOLATE — First Workshop on Interpolation Regularizers and Beyond

Pre-train, fine-tune, interpolate: a three-stage strategy for domain generalization

Alexandre Rame · Jianyu Zhang · Leon Bottou · David Lopez-Paz

Keywords: [ updatable machine learning ] [ Domain generalization ] [ Weight Averaging ] [ transfer learning ]


The goal of domain generalization is to train models that generalize well to unseen domains. To this end, the typical strategy is two-stage: first pre-training the network on a large corpus, then fine-tuning on the task's training domains. If the pre-training dataset is large enough, this pre-training is efficient because it will contain samples related to the unseen domains. Yet, large pre-training is costly and possible only for a few large companies. Rather than trying to cover all kinds of test distributions during pre-training, we propose to add a third stage: editing the featurizer after fine-tuning. To this end, we interpolate the featurizer with auxiliary featurizers trained on auxiliary datasets. This merging via weight averaging edits the main featurizer by including the features mechanisms learned on the auxiliary datasets. Empirically, we show that this editing strategy improves the performance of existing state-of-the-art models on the DomainBed benchmark by adapting the featurizer to the test domain. We hope to encourage updatable approaches beyond the direct transfer learning strategy.

Chat is not available.