Recent evidence has shown the existence of a so-called double-descent and even triple-descent behavior for the generalization error of deep-learning models. This important phenomenon commonly appears in implemented neural network architectures, and also seems to emerge in epoch-wise curves during the training process. A recent line of research has highlighted that random matrix tools can be used to obtain precise analytical asymptotics of the generalization (and training) errors of the random feature model. In this contribution, we analyze the whole temporal behavior of the generalization and training errors under gradient flow for the random feature model. We show that in the asymptotic limit of large system size the full time-evolution path of both errors can be calculated analytically. This allows us to observe how the double and triple descents develop over time, to determine whether and when early stopping is beneficial, and to reveal time-wise descent structures. Our techniques are based on Cauchy complex integral representations of the errors together with recent random matrix methods based on linear pencils.
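The setting described above (a random feature model whose second-layer weights are trained by gradient flow, with training and test errors tracked over time) can be illustrated numerically. The following is a minimal sketch, not the paper's analytic method: it assumes a ReLU random feature map, a noisy linear teacher, and an explicit Euler discretization of the gradient flow; all dimensions, the noise level, and the step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n, n_test = 50, 200, 100, 2000  # input dim, features, train/test sizes

# Noisy linear teacher generating the data (illustrative assumption).
w_star = rng.standard_normal(d) / np.sqrt(d)
X, X_test = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y = X @ w_star + 0.1 * rng.standard_normal(n)
y_test = X_test @ w_star

# Random feature map: fixed first-layer weights W, ReLU activation;
# only the second-layer weights a are trained.
W = rng.standard_normal((p, d))
Z = np.maximum(X @ W.T / np.sqrt(d), 0.0)
Z_test = np.maximum(X_test @ W.T / np.sqrt(d), 0.0)

# Gradient flow on the squared loss, integrated with small Euler steps.
a = np.zeros(p)
dt, steps = 1e-2, 5000
train_err, test_err = [], []
for t in range(steps):
    grad = Z.T @ (Z @ a - y) / n
    a -= dt * grad
    if t % 100 == 0:
        train_err.append(np.mean((Z @ a - y) ** 2))
        test_err.append(np.mean((Z_test @ a - y_test) ** 2))

# Under gradient flow the training error decreases monotonically;
# the test-error curve is where time-wise descent structures appear.
assert train_err[-1] < train_err[0]
```

Plotting `test_err` against time (and repeating the experiment while sweeping the ratio p/n) is how the epoch-wise and model-wise descent curves discussed in the abstract would be observed empirically; the paper instead derives these curves analytically in the large-system limit.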
Author Information
Antoine Bodin (École polytechnique fédérale de Lausanne (EPFL))
Nicolas Macris (EPFL)
More from the Same Authors
- 2020 Poster: Information theoretic limits of learning a sparse rule » Clément Luneau · Jean Barbier · Nicolas Macris
- 2020 Poster: All-or-nothing statistical and computational phase transitions in sparse spiked matrix estimation » Jean Barbier · Nicolas Macris · Cynthia Rush
- 2020 Spotlight: Information theoretic limits of learning a sparse rule » Clément Luneau · Jean Barbier · Nicolas Macris
- 2018 Poster: Entropy and mutual information in models of deep neural networks » Marylou Gabrié · Andre Manoel · Clément Luneau · Jean Barbier · Nicolas Macris · Florent Krzakala · Lenka Zdeborová
- 2018 Poster: The committee machine: Computational to statistical gaps in learning a two-layers neural network » Benjamin Aubin · Antoine Maillard · Jean Barbier · Florent Krzakala · Nicolas Macris · Lenka Zdeborová
- 2018 Spotlight: The committee machine: Computational to statistical gaps in learning a two-layers neural network » Benjamin Aubin · Antoine Maillard · Jean Barbier · Florent Krzakala · Nicolas Macris · Lenka Zdeborová
- 2018 Spotlight: Entropy and mutual information in models of deep neural networks » Marylou Gabrié · Andre Manoel · Clément Luneau · Jean Barbier · Nicolas Macris · Florent Krzakala · Lenka Zdeborová
- 2016 Poster: Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula » Jean Barbier · Mohamad Dia · Nicolas Macris · Florent Krzakala · Thibault Lesieur · Lenka Zdeborová