Recent developments in few-shot learning have shown that during fast adaptation, gradient-based meta-learners mostly rely on the embedding features of powerful pretrained networks. This leads us to investigate ways to adapt features effectively and utilize the meta-learner's full potential. Here, we demonstrate the effectiveness of hypernetworks in this context. We propose a soft row-sharing hypernetwork architecture and show that training the hypernetwork with a variant of MAML is tightly linked to meta-learning a curvature matrix used to condition gradients during fast adaptation. We achieve results similar to those of state-of-the-art model-agnostic methods in the overparametrized case, while outperforming many MAML variants in the compressive regime without using different optimization schemes. Furthermore, we empirically show that hypernetworks do leverage the inner-loop optimization for better adaptation, and we analyse on a toy problem how, under our proposed training algorithm, they naturally try to learn the shared curvature of constructed tasks.
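The link between adapting a hypernetwork's input and conditioning gradients with a curvature matrix can be illustrated on a toy case. The sketch below is not the soft row-sharing architecture or the training algorithm from the abstract; it assumes a purely linear hypernetwork `w = H @ e` (the names `H`, `e`, `target`, and `induced_weight_update` are hypothetical) and checks that one gradient step on the embedding `e` changes the generated weights exactly as a gradient step on `w` preconditioned by `H @ H.T` would.

```python
import numpy as np

def induced_weight_update(H, e, grad_w_fn, lr):
    """Take one gradient step on the embedding e and return the
    resulting change in the generated weights w = H @ e."""
    w = H @ e
    grad_w = grad_w_fn(w)        # gradient of the loss w.r.t. generated weights
    grad_e = H.T @ grad_w        # chain rule through the linear hypernetwork
    e_new = e - lr * grad_e      # inner-loop step on the embedding only
    return H @ e_new - w

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 3))      # assumed linear hypernetwork: 3-dim embedding -> 6 weights
e = rng.normal(size=3)
target = rng.normal(size=6)
lr = 0.1

# Toy loss 0.5 * ||w - target||^2, whose gradient w.r.t. w is (w - target).
grad_w_fn = lambda w: w - target

delta_w = induced_weight_update(H, e, grad_w_fn, lr)

# The same update written as a preconditioned gradient step on w,
# with H @ H.T playing the role of the conditioning matrix.
preconditioned = -lr * (H @ H.T) @ grad_w_fn(H @ e)
```

In this linear setting, meta-learning `H` in the outer loop amounts to meta-learning the matrix `H @ H.T` that shapes the inner-loop update; nonlinear hypernetworks satisfy this only to first order in the step size.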
Dominic Zhao (ETH Zurich)
More from the Same Authors
2022 Poster: A contrastive rule for meta-learning
Nicolas Zucchet · Simon Schug · Johannes von Oswald · Dominic Zhao · João Sacramento
2021 Poster: Learning where to learn: Gradient sparsity in meta and continual learning
Johannes von Oswald · Dominic Zhao · Seijin Kobayashi · Simon Schug · Massimo Caccia · Nicolas Zucchet · João Sacramento