Fine-tuning large pre-trained image and language models on small, customized datasets has become increasingly popular for improved prediction and efficient use of limited resources. Fine-tuning requires identifying the best models to transfer-learn from, and quantifying transferability avoids expensive re-training on all candidate model/task pairs. In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score [1], a common baseline for newer metrics, and propose a shrinkage-based estimator. This yields up to an 80% absolute gain in H-score correlation performance, making it competitive with the state-of-the-art LogME measure of [26], while our shrinkage-based H-score is 3–55 times faster than LogME. Additionally, we study the less common setting of target (as opposed to source) task selection. We highlight previously overlooked problems in such settings, such as differing numbers of labels and class-imbalance ratios, which have caused some recent metrics, e.g., NCE [24] and LEEP [18], to be misrepresented as leading measures. We propose a correction and recommend measuring correlation performance against relative accuracy in such settings. We support our findings with ~65,000 fine-tuning experiments.
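For intuition, here is a minimal sketch of what a shrinkage-based H-score can look like, assuming features X (n_samples × n_features) extracted from a pre-trained model and integer labels y. The Ledoit-Wolf estimator from scikit-learn is used as an illustrative stand-in for the shrinkage step; the paper's exact shrinkage formulation may differ.

```python
# Minimal sketch: H-score with a shrinkage covariance estimate.
# Assumes `X` (n_samples x n_features) are features from a pre-trained
# model and `y` are integer class labels. LedoitWolf is an illustrative
# choice of shrinkage estimator, not necessarily the paper's exact one.
import numpy as np
from sklearn.covariance import LedoitWolf

def shrinkage_hscore(X: np.ndarray, y: np.ndarray) -> float:
    X = X - X.mean(axis=0)                     # center the features
    cov_f = LedoitWolf().fit(X).covariance_    # shrinkage estimate of cov(f)
    # Replace each sample with its class-conditional mean to get cov(E[f|y]).
    g = np.zeros_like(X)
    for c in np.unique(y):
        mask = (y == c)
        g[mask] = X[mask].mean(axis=0)
    cov_g = np.cov(g, rowvar=False)
    # H-score = tr(cov(f)^{-1} cov(E[f|y])); solve() avoids an explicit inverse.
    return float(np.trace(np.linalg.solve(cov_f, cov_g)))
```

A higher score suggests the features of a candidate source model are more predictive of the target labels, so models can be ranked by this score without re-training each one.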
Author Information
Shibal Ibrahim (Massachusetts Institute of Technology)
Natalia Ponomareva (Google)
Rahul Mazumder (Massachusetts Institute of Technology)
More from the Same Authors
- 2022: Network Pruning at Scale: A Discrete Optimization Approach
  Wenyu Chen · Riade Benbaki · Xiang Meng · Rahul Mazumder
- 2022: A Light-speed Linear Program Solver for Personalized Recommendation with Diversity Constraints
  Miao Cheng · Haoyue Wang · Aman Gupta · Rahul Mazumder · Sathiya Selvaraj · Kinjal Basu
- 2022: Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization
  Kayhan Behdin · Qingquan Song · Aman Gupta · Sathiya Selvaraj · David Durfee · Ayan Acharya · Rahul Mazumder
- 2022 Poster: Pushing the limits of fairness impossibility: Who's the fairest of them all?
  Brian Hsu · Rahul Mazumder · Preetam Nandy · Kinjal Basu
- 2021 Poster: DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
  Hussein Hazimeh · Zhe Zhao · Aakanksha Chowdhery · Maheswaran Sathiamoorthy · Yihua Chen · Rahul Mazumder · Lichan Hong · Ed Chi
- 2021 Poster: Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data
  Qi Zhu · Natalia Ponomareva · Jiawei Han · Bryan Perozzi