Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation
Alexander Wang · Sasha (Alexandre) Doubov · Gary Leung
Event URL: https://openreview.net/forum?id=k9l1KkV4eQc

Meta-learning for few-shot classification has been challenged both on its effectiveness compared to simpler pretraining methods and on the validity of its claim of "learning to learn". Recent work has suggested that MAML-based models do not perform "rapid learning" in the inner loop but instead reuse features, adapting only the final linear layer. Separately, BatchNorm, a nearly ubiquitous component of model architectures, has been shown to have an implicit learning rate decay effect on the preceding layers of a network. We study the impact of BatchNorm's implicit learning rate decay on feature reuse in meta-learning methods and find that counteracting it increases the change in intermediate layers during adaptation. We also find that counteracting this learning rate decay sometimes improves performance on few-shot classification tasks.
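The implicit learning rate decay mentioned in the abstract is commonly explained through scale invariance: a layer followed by BatchNorm produces the same output when its weights are multiplied by a positive constant, so the gradient with respect to those weights shrinks in inverse proportion to the weight norm. The sketch below is only an illustration of that general argument, not the authors' experimental setup. It assumes a single linear layer, BatchNorm without affine parameters or epsilon, a squared-error loss, and random toy data; all function names are hypothetical.

```python
import math
import random

def batchnorm(z):
    """Normalize a batch of scalar pre-activations to zero mean, unit variance."""
    mean = sum(z) / len(z)
    var = sum((v - mean) ** 2 for v in z) / len(z)
    return [(v - mean) / math.sqrt(var) for v in z]

def loss(w, xs, ys):
    """Squared error after a linear layer followed by BatchNorm (toy assumption)."""
    z = [sum(wi * xi for wi, xi in zip(w, x)) for x in xs]
    h = batchnorm(z)
    return sum((hi - yi) ** 2 for hi, yi in zip(h, ys)) / len(ys)

def numerical_grad(w, xs, ys, step=1e-6):
    """Central finite-difference gradient of the loss with respect to w."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += step
        down = list(w)
        down[i] -= step
        grad.append((loss(up, xs, ys) - loss(down, xs, ys)) / (2 * step))
    return grad

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# Random toy data: 8 samples with 3 features, one output unit.
rng = random.Random(0)
xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
ys = [rng.gauss(0, 1) for _ in range(8)]
w = [rng.gauss(0, 1) for _ in range(3)]
w_scaled = [2.0 * wi for wi in w]  # same direction, twice the norm

g = numerical_grad(w, xs, ys)
g_scaled = numerical_grad(w_scaled, xs, ys)

# Scaling the weights leaves the loss unchanged ...
loss_gap = abs(loss(w, xs, ys) - loss(w_scaled, xs, ys))
# ... but halves the gradient norm, since the gradient scales as 1 / ||w||.
grad_ratio = norm(g_scaled) / norm(g)
# The gradient is also orthogonal to w, so plain SGD steps can only grow ||w||.
radial_component = sum(gi * wi for gi, wi in zip(g, w))

print(loss_gap, grad_ratio, radial_component)
```

Because the gradient is orthogonal to the weights, plain gradient steps tend to increase the weight norm over training, which in turn shrinks later gradients. This is the sense in which the effective learning rate of layers preceding BatchNorm decays without any explicit schedule.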

Author Information

Alexander Wang (University of Toronto)
Sasha (Alexandre) Doubov (University of Toronto)
Gary Leung (University of Toronto)