Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms that optimize neural network parameters remain largely hand-designed and computationally inefficient. We study whether deep learning can directly predict these parameters by exploiting past knowledge from training other networks. We introduce DeepNets-1M, a large-scale dataset of diverse computational graphs of neural architectures, and use it to explore parameter prediction on CIFAR-10 and ImageNet. By leveraging advances in graph neural networks, we propose a hypernetwork that can predict performant parameters in a single forward pass taking a fraction of a second, even on a CPU. The proposed model achieves surprisingly good performance on unseen and diverse networks. For example, it is able to predict all 24 million parameters of a ResNet-50, achieving 60% accuracy on CIFAR-10. On ImageNet, the top-5 accuracy of some of our networks approaches 50%. Our task, along with the model and results, can potentially lead to a new, more computationally efficient paradigm of training networks. Our model also learns a strong representation of neural architectures, enabling their analysis.
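To make the idea of predicting parameters from a computational graph concrete, here is a minimal toy sketch in pure Python. It is not the authors' model or code: the class `ToyHyperNet`, the constants `OPS` and `HIDDEN`, the single round of message passing, and the slicing decoder are all hypothetical choices made for illustration, and the hypernetwork here is untrained, so the predicted weights are meaningless. The sketch only shows the data flow: an architecture goes in as a graph, and one forward pass returns a weight matrix of the right shape for every parameterized layer.

```python
import math
import random

# Hypothetical toy setup: each node is one layer of a small MLP, described by
# an op type and the shape of the weight matrix it needs (None if it has none).
OPS = ["linear", "relu"]
HIDDEN = 8  # size of node embeddings (illustrative choice)


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for learned hypernetwork weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class ToyHyperNet:
    """One round of message passing, then a shared decoder emits weights."""

    def __init__(self, max_params, seed=0):
        rng = random.Random(seed)
        self.max_params = max_params
        self.embed = {op: [rng.gauss(0.0, 1.0) for _ in range(HIDDEN)] for op in OPS}
        self.w_msg = rand_matrix(HIDDEN, HIDDEN, rng)
        self.w_dec = rand_matrix(max_params, HIDDEN, rng)

    def predict(self, nodes, edges):
        """nodes: list of (op, shape or None); edges: list of (src, dst)."""
        h = [list(self.embed[op]) for op, _ in nodes]
        # Message passing: every node sums transformed states of its neighbours,
        # in both the forward and the backward direction of each edge.
        msgs = [[0.0] * HIDDEN for _ in nodes]
        for src, dst in edges:
            for a, b in ((src, dst), (dst, src)):
                m = matvec(self.w_msg, h[a])
                msgs[b] = [x + y for x, y in zip(msgs[b], m)]
        h = [[math.tanh(x + y) for x, y in zip(hi, mi)] for hi, mi in zip(h, msgs)]
        # Decode: produce a flat vector per node and slice it to the needed shape.
        params = {}
        for i, (_, shape) in enumerate(nodes):
            if shape is None:
                continue
            rows, cols = shape
            if rows * cols > self.max_params:
                raise ValueError("layer needs more parameters than the decoder emits")
            flat = matvec(self.w_dec, h[i])[: rows * cols]
            params[i] = [flat[r * cols:(r + 1) * cols] for r in range(rows)]
        return params


# A 3-node graph: linear(4 -> 3) -> relu -> linear(3 -> 2)
nodes = [("linear", (3, 4)), ("relu", None), ("linear", (2, 3))]
edges = [(0, 1), (1, 2)]
hyper = ToyHyperNet(max_params=12)
params = hyper.predict(nodes, edges)  # one forward pass, no gradient steps
```

In this sketch the same `ToyHyperNet` instance can be called on a different graph (for example, one with more layers or other shapes up to `max_params` entries) without any retraining, which is the property the abstract refers to when it mentions unseen architectures.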
Author Information
Boris Knyazev (University of Guelph / Vector Institute)
Michal Drozdzal (FAIR)
Graham Taylor (University of Guelph / Vector Institute)
Adriana Romero Soriano (Facebook AI Research)
More from the Same Authors
- 2020: Building LEGO using Deep Generative Models of Graphs
  Rylee Thompson · Graham Taylor · Terrance DeVries · Elahe Ghalebi
- 2021: Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics
  Charan Reddy · Deepak Sharma · Soroush Mehri · Adriana Romero Soriano · Samira Shabanian · Sina Honari
- 2021 Spotlight: Instance-Conditioned GAN
  Arantxa Casanova · Marlene Careil · Jakob Verbeek · Michal Drozdzal · Adriana Romero Soriano
- 2021: An Empirical Study of Neural Kernel Bandits
  Michal Lisicki · Arash Afkanpour · Graham Taylor
- 2021: DeepRNG: Towards Deep Reinforcement Learning-Assisted Generative Testing of Software
  Chuan-Yung Tsai · Graham Taylor
- 2021: Neural Structure Mapping For Learning Abstract Visual Analogies
  Shashank Shekhar · Graham Taylor
- 2021 Poster: Instance-Conditioned GAN
  Arantxa Casanova · Marlene Careil · Jakob Verbeek · Michal Drozdzal · Adriana Romero Soriano
- 2021 Poster: Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning
  Hyunsoo Chung · Jungtaek Kim · Boris Knyazev · Jinhwi Lee · Graham Taylor · Jaesik Park · Minsu Cho
- 2021 Poster: Active 3D Shape Reconstruction from Vision and Touch
  Edward Smith · David Meger · Luis Pineda · Roberto Calandra · Jitendra Malik · Adriana Romero Soriano · Michal Drozdzal
- 2020 Poster: Instance Selection for GANs
  Terrance DeVries · Michal Drozdzal · Graham Taylor
- 2020 Poster: 3D Shape Reconstruction from Vision and Touch
  Edward Smith · Roberto Calandra · Adriana Romero · Georgia Gkioxari · David Meger · Jitendra Malik · Michal Drozdzal
- 2020 Session: Orals & Spotlights Track 08: Deep Learning
  Graham Taylor · Mario Lucic
- 2019 Workshop: Science meets Engineering of Deep Learning
  Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas
- 2019: Welcoming remarks and introduction
  Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas
- 2019 Poster: Understanding Attention and Generalization in Graph Neural Networks
  Boris Knyazev · Graham Taylor · Mohamed Amer
- 2017: Poster spotlights
  Hiroshi Kuwajima · Masayuki Tanaka · Qingkai Liang · Matthieu Komorowski · Fanyu Que · Thalita F Drumond · Aniruddh Raghu · Leo Anthony Celi · Christina Göpfert · Andrew Ross · Sarah Tan · Rich Caruana · Yin Lou · Devinder Kumar · Graham Taylor · Forough Poursabzi-Sangdeh · Jennifer Wortman Vaughan · Hanna Wallach
- 2015: Learning Multi-scale Temporal Dynamics with Recurrent Neural Networks
  Graham Taylor
- 2011 Workshop: Big Learning: Algorithms, Systems, and Tools for Learning at Scale
  Joseph E Gonzalez · Sameer Singh · Graham Taylor · James Bergstra · Alice Zheng · Misha Bilenko · Yucheng Low · Yoshua Bengio · Michael Franklin · Carlos Guestrin · Andrew McCallum · Alexander Smola · Michael Jordan · Sugato Basu
- 2011 Poster: Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines
  Matthew D Zeiler · Graham Taylor · Leonid Sigal · Iain Matthews · Rob Fergus
- 2010 Poster: Pose-Sensitive Embedding by Nonlinear NCA Regression
  Graham Taylor · Rob Fergus · George Williams · Ian Spiro · Christoph Bregler
- 2008 Poster: The Recurrent Temporal Restricted Boltzmann Machine
  Ilya Sutskever · Geoffrey E Hinton · Graham Taylor
- 2006 Poster: Modeling Human Motion Using Binary Latent Variables
  Graham Taylor · Geoffrey E Hinton · Sam T Roweis
- 2006 Spotlight: Modeling Human Motion Using Binary Latent Variables
  Graham Taylor · Geoffrey E Hinton · Sam T Roweis