Given a pair of models with similar training set performance, it is natural to assume that the model with simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the model's representation of the data. First, we show that common regularization techniques uniformly decrease the last-layer ID (LLID) of validation set activations for image classification models, and show how this decrease strongly affects model generalization performance. We also investigate how excessive regularization decreases a model's ability to extract features from data in earlier layers, leading to a negative effect on validation accuracy even while LLID continues to decrease and training accuracy remains near-perfect. Finally, we examine the LLID over the course of training of models that exhibit grokking. We observe that well after training accuracy saturates, when models "grok" and validation accuracy suddenly improves from random to perfect, there is a co-occurrent sudden drop in LLID, thus providing more insight into the dynamics of sudden generalization.
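The abstract does not specify how ID is estimated, but the intrinsic dimension of activation sets is commonly estimated with the TwoNN maximum-likelihood estimator of Facco et al.; the following is a minimal illustrative sketch (the function name `twonn_id` and the synthetic data are assumptions, not from the paper), assuming Euclidean distances and no duplicate points:

```python
import numpy as np

def twonn_id(X):
    """TwoNN intrinsic-dimension estimate: on a d-dimensional manifold,
    the ratio mu_i = r2_i / r1_i of each point's second- to first-nearest-
    neighbour distance follows a Pareto(d) law, giving the maximum-
    likelihood estimate d = N / sum_i log(mu_i)."""
    # Pairwise Euclidean distances (O(N^2) memory; fine for modest N).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(D, np.inf)        # exclude self-distances
    nearest = np.sort(D, axis=1)
    r1, r2 = nearest[:, 0], nearest[:, 1]
    mu = r2 / r1                       # assumes r1 > 0 (no duplicate points)
    return len(X) / np.log(mu).sum()

# Example: points on a 2-D plane linearly embedded in 10-D ambient space;
# the estimate should land near the true manifold dimension of 2.
rng = np.random.default_rng(0)
Z = rng.uniform(size=(1000, 2))
X = Z @ rng.normal(size=(2, 10))
d_hat = twonn_id(X)
```

In the setting of the paper, `X` would instead be the matrix of last-layer activations over the validation set, and LLID would be tracked as training or regularization strength varies.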
Author Information
Bradley Brown (University of Waterloo)
Jordan Juravsky (University of Waterloo)
Anthony Caterini (Layer 6 AI / University of Oxford)
Gabriel Loaiza-Ganem (Layer 6 AI)
More from the Same Authors

- 2021: Entropic Issues in Likelihood-Based OOD Detection (Anthony Caterini · Gabriel Loaiza-Ganem)
- 2022: CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds (Jesse Cresswell · Brendan Ross · Gabriel Loaiza-Ganem · Humberto Reyes-Gonzalez · Marco Letizia · Anthony Caterini)
- 2022: Relating Regularization and Generalization through the Intrinsic Dimension of Activations (Bradley Brown · Jordan Juravsky · Anthony Caterini · Gabriel Loaiza-Ganem)
- 2022: The Union of Manifolds Hypothesis (Bradley Brown · Anthony Caterini · Brendan Ross · Jesse Cresswell · Gabriel Loaiza-Ganem)
- 2022: Denoising Deep Generative Models (Gabriel Loaiza-Ganem · Brendan Ross · Luhuan Wu · John Cunningham · Jesse Cresswell · Anthony Caterini)
- 2023: TabPFGen – Tabular Data Generation with TabPFN (Jeremy (Junwei) Ma · Apoorv Dankar · George Stein · Guangwei Yu · Anthony Caterini)
- 2023 Poster: Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models (George Stein · Jesse Cresswell · Rasa Hosseinzadeh · Yi Sui · Brendan Ross · Valentin Villecroze · Zhaoyan Liu · Anthony Caterini · Eric Taylor · Gabriel Loaiza-Ganem)
- 2022: Poster Session 1 (Andrew Lowy · Thomas Bonnier · Yiling Xie · Guy Kornowski · Simon Schug · Seungyub Han · Nicolas Loizou · xinwei zhang · Laurent Condat · Tabea E. Röber · Si Yi Meng · Marco Mondelli · Runlong Zhou · Eshaan Nichani · Adrian Goldwaser · Rudrajit Das · Kayhan Behdin · Atish Agarwala · Mukul Gagrani · Gary Cheng · Tian Li · Haoran Sun · Hossein Taheri · Allen Liu · Siqi Zhang · Dmitrii Avdiukhin · Bradley Brown · Miaolan Xie · Junhyung Lyle Kim · Sharan Vaswani · Xinmeng Huang · Ganesh Ramachandra Kini · Angela Yuan · Weiqiang Zheng · Jiajin Li)
- 2022: Spotlight 5 - Gabriel Loaiza-Ganem: Denoising Deep Generative Models (Gabriel Loaiza-Ganem)
- 2021 Poster: Rectangular Flows for Manifold Learning (Anthony Caterini · Gabriel Loaiza-Ganem · Geoff Pleiss · John Cunningham)
- 2020 Poster: Invertible Gaussian Reparameterization: Revisiting the Gumbel-Softmax (Andres Potapczynski · Gabriel Loaiza-Ganem · John Cunningham)
- 2019 Poster: Deep Random Splines for Point Process Intensity Estimation of Neural Population Data (Gabriel Loaiza-Ganem · Sean Perkins · Karen Schroeder · Mark Churchland · John Cunningham)
- 2019 Poster: The continuous Bernoulli: fixing a pervasive error in variational autoencoders (Gabriel Loaiza-Ganem · John Cunningham)