We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite the popularity of this setting in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs actually learn in it. In this paper, we show analytically for convolutional and fully connected networks that an alignment between the principal components of the network parameters and of the data takes place when training with random labels. We study this alignment effect by investigating neural networks pre-trained on randomly labelled image data and subsequently fine-tuned on disjoint datasets with random or real labels. We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream than networks trained from scratch, even after accounting for simple effects such as weight scaling. We also analyze how competing effects, such as specialization at later layers, can hide the positive transfer. These effects are studied in several network architectures, including VGG16 and ResNet18, on CIFAR10 and ImageNet.
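The sketch below (not the authors' code) illustrates one way the alignment described in the abstract could be probed empirically: compare the top principal components of the input data (flattened image patches) with the first-layer filters of a trained network, and measure how much of the filters' energy falls inside that principal subspace. The names `first_layer_weights` and `patches`, the scoring function, and the synthetic demo data are all hypothetical placeholders, assumed only for illustration.

```python
# Minimal sketch (assumptions noted above): measure how strongly a set of
# first-layer filters concentrates in the top-k principal subspace of the data.
import numpy as np


def top_principal_components(X, k):
    """Top-k eigenvectors of the (uncentered) covariance of the rows of X."""
    cov = X.T @ X / X.shape[0]
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]        # columns = top-k principal directions


def alignment_score(first_layer_weights, patches, k=10):
    """Fraction of total filter energy lying in the span of the top-k data components.

    first_layer_weights: (num_filters, d) array, each row a flattened filter.
    patches:             (num_patches, d) array of flattened input patches.
    A value near 1 means the filters live in the top-k principal subspace of the
    data; filters drawn uniformly at random would give roughly k / d.
    """
    U = top_principal_components(patches, k)          # (d, k), orthonormal columns
    W = first_layer_weights
    projected = W @ U                                 # coordinates in the subspace
    return float(np.sum(projected ** 2) / np.sum(W ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 27, 5                                      # e.g. flattened 3x3x3 patches
    # Synthetic "data" with a dominant k-dimensional structure plus small noise.
    basis = rng.normal(size=(d, k))
    patches = rng.normal(size=(10_000, k)) @ basis.T + 0.1 * rng.normal(size=(10_000, d))
    aligned_filters = rng.normal(size=(64, k)) @ basis.T   # lie in the data subspace
    random_filters = rng.normal(size=(64, d))               # baseline: no alignment
    print("aligned filters:", alignment_score(aligned_filters, patches, k))   # ~1.0
    print("random filters: ", alignment_score(random_filters, patches, k))    # ~k/d
```

In this toy example, filters constructed inside the data's principal subspace score close to 1, while random filters score near k/d, which is the kind of contrast a pre-trained-versus-random comparison would look for.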
Author Information
Hartmut Maennel (Google)
Ibrahim Alabdulmohsin (Google Research)
Ilya Tolstikhin (Google, Brain Team, Zurich)
Robert Baldock (Google)
Olivier Bousquet (Google Brain (Zurich))
Sylvain Gelly (Google Brain (Zurich))
Daniel Keysers (Google Research, Brain Team)
Related Events (a corresponding poster, oral, or spotlight)
- 2020 Poster: What Do Neural Networks Learn When Trained With Random Labels?
  Tue Dec 8th 05:00 -- 07:00 PM · Room: Poster Session 1
More from the Same Authors
- 2020 Memorial: In Memory of Olivier Chapelle
  Bernhard Schölkopf · Andre Elisseeff · Olivier Bousquet · Vladimir Vapnik · Jason E Weston
- 2020 Poster: Synthetic Data Generators -- Sequential and Private
  Olivier Bousquet · Roi Livni · Shay Moran
- 2019 Poster: Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
  Carlos Riquelme · Hugo Penedones · Damien Vincent · Hartmut Maennel · Sylvain Gelly · Timothy A Mann · Andre Barreto · Gergely Neu
- 2019 Poster: Practical and Consistent Estimation of f-Divergences
  Paul Rubenstein · Olivier Bousquet · Josip Djolonga · Carlos Riquelme · Ilya Tolstikhin
- 2018 Poster: Assessing Generative Models via Precision and Recall
  Mehdi S. M. Sajjadi · Olivier Bachem · Mario Lucic · Olivier Bousquet · Sylvain Gelly
- 2018 Poster: Are GANs Created Equal? A Large-Scale Study
  Mario Lucic · Karol Kurach · Marcin Michalski · Sylvain Gelly · Olivier Bousquet
- 2017 Workshop: Optimal Transport and Machine Learning
  Olivier Bousquet · Marco Cuturi · Gabriel Peyré · Fei Sha · Justin Solomon
- 2017 Poster: Approximation and Convergence Properties of Generative Adversarial Learning
  Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri
- 2017 Spotlight: Approximation and Convergence Properties of Generative Adversarial Learning
  Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri
- 2017 Poster: AdaGAN: Boosting Generative Models
  Ilya Tolstikhin · Sylvain Gelly · Olivier Bousquet · Carl-Johann Simon-Gabriel · Bernhard Schölkopf
- 2016 Poster: Minimax Estimation of Maximum Mean Discrepancy with Radial Kernels
  Ilya Tolstikhin · Bharath Sriperumbudur · Bernhard Schölkopf
- 2016 Poster: Consistent Kernel Mean Estimation for Functions of Random Variables
  Carl-Johann Simon-Gabriel · Adam Scibior · Ilya Tolstikhin · Bernhard Schölkopf
- 2013 Poster: PAC-Bayes-Empirical-Bernstein Inequality
  Ilya Tolstikhin · Yevgeny Seldin
- 2013 Spotlight: PAC-Bayes-Empirical-Bernstein Inequality
  Ilya Tolstikhin · Yevgeny Seldin
- 2007 Poster: The Tradeoffs of Large Scale Learning
  Leon Bottou · Olivier Bousquet
- 2006 Demonstration: MoGo: exploration-exploitation in Monte-Carlo Go using UCT and patterns
  Olivier Teytaud · Sylvain Gelly