Workshop: Workshop on Distribution Shifts: New Frontiers with Foundation Models

Continually Adapting Optimizers Improve Meta-Generalization

Wenyi Wang · Louis Kirsch · Francesco Faccio · Mingchen Zhuge · Jürgen Schmidhuber

Keywords: [ meta learning ] [ generalization ] [ Learned Optimizer ] [ adaptation ]


Meta-learned optimizers increasingly outperform analytical handcrafted optimizers such as SGD and Adam. On some tasks, however, they fail to generalize well and underperform handcrafted methods. In such cases, one can fall back on a handcrafted method through a guard, combining the efficiency benefits of learned optimizers with the guarantees of analytical methods. At some point in the iterative optimization process, however, such guards may make the learned optimizer incompatible with the remaining optimization, and thus useless for further progress. Our novel method, Meta Guard, keeps adapting the learned optimizer to the target optimization problem. In experiments, it outperforms the baselines and adapts to new tasks during training.
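
To make the notion of a guard concrete, here is a minimal sketch of one common form, a loss-comparison guard, on a toy quadratic problem. It is an illustration under stated assumptions, not the Meta Guard method itself: the objective, the names `learned_step`, `sgd_step`, and `guarded_step`, and the step sizes are all hypothetical, and the "learned" optimizer is a fixed stand-in that is not adapted during the run.

```python
from typing import Callable, List, Tuple

Vector = List[float]

# Toy target problem (assumption): f(x) = 0.5 * sum(c_i * x_i^2).
CURVATURE: Vector = [1.0, 10.0]


def loss(x: Vector) -> float:
    return 0.5 * sum(c * v * v for c, v in zip(CURVATURE, x))


def grad(x: Vector) -> Vector:
    return [c * v for c, v in zip(CURVATURE, x)]


def sgd_step(x: Vector, g: Vector, lr: float = 0.05) -> Vector:
    # Handcrafted fallback: plain gradient descent with a safe step size.
    return [v - lr * gi for v, gi in zip(x, g)]


def learned_step(x: Vector, g: Vector) -> Vector:
    # Hypothetical stand-in for a meta-learned optimizer. Its step size
    # is too large for the high-curvature coordinate of this problem,
    # mimicking a learned optimizer that generalizes poorly.
    return [v - 0.25 * gi for v, gi in zip(x, g)]


def guarded_step(
    x: Vector, propose: Callable[[Vector, Vector], Vector]
) -> Tuple[Vector, bool]:
    # Evaluate both candidate updates and keep the one with lower loss.
    # Returns the new iterate and whether the learned proposal was used.
    g = grad(x)
    candidate = propose(x, g)
    fallback = sgd_step(x, g)
    if loss(candidate) <= loss(fallback):
        return candidate, True
    return fallback, False


def run(steps: int = 50) -> Tuple[Vector, int]:
    x = [1.0, 1.0]
    used_learned = 0
    for _ in range(steps):
        x, accepted = guarded_step(x, learned_step)
        used_learned += int(accepted)
    return x, used_learned


if __name__ == "__main__":
    final_x, used_learned = run()
    print("final loss:", loss(final_x))
    print("steps taken by the learned proposal:", used_learned)
```

Because the guard always keeps the better of the two candidates, each step is at least as good as the fallback step alone. The sketch does not show the failure mode described above, where a fixed learned optimizer becomes incompatible with the remaining optimization, nor the continual adaptation that addresses it.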