Five years ago, it took more than a month to train a state-of-the-art image recognition model on the ImageNet dataset. Earlier this year, Facebook demonstrated that such a model could be trained in an hour. However, if we could parallelize this training problem across the world’s fastest supercomputers (~100 PFlops), it would be possible to train the same model in under a minute. This workshop is about closing that gap: how can we turn months into minutes and increase the productivity of machine learning researchers everywhere?
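To make the month-to-minute arithmetic concrete, here is a back-of-envelope estimate. The figures below (a ResNet-50-scale model, ~90 epochs over ImageNet's ~1.28M training images, ~12 GFLOPs per image for a combined forward and backward pass) are rough assumptions of ours, not numbers from the workshop:

```python
# Back-of-envelope training-time estimate under the assumptions above.
images = 1.28e6            # ImageNet training set size (approx.)
epochs = 90                # typical ResNet-50 training schedule
flops_per_image = 12e9     # forward + backward pass, rough figure
total_flops = images * epochs * flops_per_image   # ~1.4e18 FLOPs

machine_flops = 100e15     # ~100 PFlop/s supercomputer
for utilization in (1.0, 0.25):  # perfect vs. more realistic efficiency
    seconds = total_flops / (machine_flops * utilization)
    print(f"utilization {utilization:.0%}: ~{seconds:.0f} s")
# utilization 100%: ~14 s
# utilization 25%: ~55 s
```

Even at a modest 25% utilization, the estimate lands under a minute, which is what motivates the question of closing the gap between demonstrated and theoretical scaling.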
This one-day workshop will facilitate active debate and interaction across many different disciplines. The conversation will range from algorithms to infrastructure to silicon, with invited speakers from Cerebras, DeepMind, Facebook, Google, OpenAI, and other organizations. When should synchronous training be preferred over asynchronous training? Are large batch sizes the key to reaching supercomputer scale, or is it possible to fully utilize a supercomputer at batch size one? How important is sparsity in enabling us to scale? Should sparsity patterns be structured or unstructured? To what extent do we expect to customize model architectures for particular problem domains, and to what extent can a “single model architecture” deliver state-of-the-art results across many different domains? How can new hardware architectures unlock even higher real-world training performance?
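One concrete answer to the large-batch question appears in the program below (“Don't Decay the Learning Rate, Increase the Batch Size” and “ImageNet In 1 Hour”): scale the learning rate linearly with the batch size and ramp it up gradually. Here is a minimal sketch of that recipe; the constants (base learning rate 0.1 at batch 256, a warmup of roughly five epochs) follow the ImageNet-in-1-hour setup, and the code is our illustration rather than code from any of the talks:

```python
def scaled_lr(base_lr: float, batch_size: int, base_batch: int = 256) -> float:
    """Linear scaling rule: multiply the learning rate by the batch-size ratio."""
    return base_lr * batch_size / base_batch

def lr_at_step(target_lr: float, step: int, warmup_steps: int) -> float:
    """Ramp the learning rate linearly from ~0 to target_lr during warmup."""
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr

# At batch 8192, the reference lr of 0.1 becomes 3.2, reached over a
# warmup of ~780 steps (~5 epochs at batch 8192 on ImageNet).
target = scaled_lr(0.1, batch_size=8192)
print(target)                                          # 3.2
print(lr_at_step(target, step=0, warmup_steps=780))    # ~0.004 at the start
print(lr_at_step(target, step=780, warmup_steps=780))  # 3.2 after warmup
```

The warmup matters because jumping straight to the scaled rate tends to destabilize early training; the linear ramp lets optimization settle before the full rate is applied.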
Our goal is to bring together people who are trying to answer any of these questions, in the hope that cross-pollination will accelerate progress towards deep learning at true supercomputer scale.
Schedule

| Time (Sat) | Session | Speaker(s) |
| --- | --- | --- |
| 8:10 a.m. - 8:30 a.m. | Generalization Gap (Presentation) | Nitish Shirish Keskar |
| 8:30 a.m. - 8:50 a.m. | Closing the Generalization Gap (Presentation) | Itay Hubara |
| 8:50 a.m. - 9:10 a.m. | Don't Decay the Learning Rate, Increase the Batch Size (Presentation) | Sam Smith |
| 9:10 a.m. - 9:30 a.m. | ImageNet In 1 Hour (Presentation) | Priya Goyal |
| 9:30 a.m. - 9:50 a.m. | Training with TPUs (Presentation) | Chris Ying |
| 9:50 a.m. - 10:10 a.m. | Coffee Break | |
| 10:10 a.m. - 10:30 a.m. | KFAC and Natural Gradients (Presentation) | Matthew Johnson · Daniel Duckworth |
| 10:30 a.m. - 10:50 a.m. | Neumann Optimizer (Presentation) | Shankar Krishnan |
| 10:50 a.m. - 11:10 a.m. | Evolutionary Strategies (Presentation) | Tim Salimans |
| 11:15 a.m. - 12:00 p.m. | Future Hardware Directions (Discussion Panel) | Gregory Diamos · Jeff Dean · Simon Knowles · Michael James · Scott Gray |
| 1:30 p.m. - 1:50 p.m. | Learning Device Placement (Presentation) | Azalia Mirhoseini |
| 1:50 p.m. - 2:10 p.m. | Scaling and Sparsity (Presentation) | Gregory Diamos |
| 2:10 p.m. - 2:30 p.m. | Small World Network Architectures (Presentation) | Scott Gray |
| 2:30 p.m. - 2:50 p.m. | Scalable RL and AlphaGo (Presentation) | Timothy Lillicrap |
| 3:20 p.m. - 3:40 p.m. | Scaling Deep Learning to 15 PetaFlops (Presentation) | Thorsten Kurth |
| 3:40 p.m. - 4:00 p.m. | Scalable Silicon Compute (Presentation) | Simon Knowles |
| 4:00 p.m. - 4:20 p.m. | Practical Scaling Techniques (Presentation) | |
| 4:20 p.m. - 4:40 p.m. | Designing for Supercompute-Scale Deep Learning (Presentation) | Michael James |
| 5:00 p.m. - 6:00 p.m. | Adaptive Memory Networks (Poster Session) | Daniel Li |
| 5:00 p.m. - 6:00 p.m. | Supercomputers for Deep Learning (Poster Session) | Sreenivas Sukumar |
Author Information
Erich Elsen (Google)
Danijar Hafner (Google Brain & UCL)
Zak Stone (Google Brain)
Zak Stone is the Product Manager for TensorFlow on the Google Brain team. He contributes to product strategy, leads the TensorFlow Research Cloud program, and enjoys interacting with TensorFlow's vibrant open-source community. Prior to joining Google, Zak founded a mobile-focused deep learning startup that was acquired by Apple. While at Apple, Zak contributed to the on-device face identification technology in iOS 10 and macOS Sierra that was announced at WWDC 2016.
Brennan Saeta (Google)