Timezone: »
The Tensor Programs framework has produced a series of powerful results by 1) expressing any deep learning computation of concern as a principled composition of element-wise nonlinearities and matrix multiplication, and 2) inductively reasoning about the program behavior as the sizes of the matrices in the program tend to infinity. For example, this framework helped to show that infinitely wide neural networks exhibit Gaussian process behavior at initialization and evolve like a kernel model during training in the so-called NTK parameterization (Yang, 2019b, 2020a; Yang and Littwin, 2021). Moreover, this framework yielded a novel parameterization, coined μP (Yang and Hu, 2021), that for the first time enabled hyperparameter tuning for enormous networks too expensive to train more than once (Yang et al., 2022). However, this framework has so far been limited to Gaussian initialized weights, while uniform or truncated Gaussian distributions are more prevalent in practice. This work extends Tensor Programs to general non-Gaussian weights, thus recovering all of the above results in all practical settings.
Author Information
Eugene Golikov (École polytechnique fédérale de Lausanne)
MSc in Fluid Mechanics @ Moscow SU MSc in Computer Science @ HSE Doing a PhD in DL theory @ EPFL
Greg Yang (Microsoft Research)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Non-Gaussian Tensor Programs »
Thu. Dec 1st 05:00 -- 07:00 PM Room Hall J #535
More from the Same Authors
-
2021 : Few-Shot Learning Evaluation in Natural Language Understanding »
Subhabrata Mukherjee · Xiaodong Liu · Guoqing Zheng · Saghar Hosseini · Hao Cheng · Ge Yang · Christopher Meek · Ahmed Awadallah · Jianfeng Gao -
2022 Spotlight: Lightning Talks 1B-2 »
Eugene Golikov · Nils M. Kriege · Qing Xiu · Kai Han · Greg Yang · Jing Tang · Shuang Cui · He Huang -
2022 Poster: Feature Learning in $L_2$-regularized DNNs: Attraction/Repulsion and Sparsity »
Arthur Jacot · Eugene Golikov · Clement Hongler · Franck Gabriel -
2022 Poster: High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation »
Jimmy Ba · Murat Erdogdu · Taiji Suzuki · Zhichao Wang · Denny Wu · Greg Yang -
2022 Poster: 3DB: A Framework for Debugging Computer Vision Models »
Guillaume Leclerc · Hadi Salman · Andrew Ilyas · Sai Vemprala · Logan Engstrom · Vibhav Vineet · Kai Xiao · Pengchuan Zhang · Shibani Santurkar · Greg Yang · Ashish Kapoor · Aleksander Madry -
2021 Poster: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer »
Ge Yang · Edward Hu · Igor Babuschkin · Szymon Sidor · Xiaodong Liu · David Farhi · Nick Ryder · Jakub Pachocki · Weizhu Chen · Jianfeng Gao -
2020 Poster: Denoised Smoothing: A Provable Defense for Pretrained Classifiers »
Hadi Salman · Mingjie Sun · Greg Yang · Ashish Kapoor · J. Zico Kolter -
2019 Poster: A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks »
Hadi Salman · Greg Yang · Huan Zhang · Cho-Jui Hsieh · Pengchuan Zhang -
2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Poster: Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes »
Greg Yang -
2018 : Poster Session 1 »
Stefan Gadatsch · Danil Kuzin · Navneet Kumar · Patrick Dallaire · Tom Ryder · Remus-Petru Pop · Nathan Hunt · Adam Kortylewski · Sophie Burkhardt · Mahmoud Elnaggar · Dieterich Lawson · Yifeng Li · Jongha (Jon) Ryu · Juhan Bae · Micha Livne · Tim Pearce · Mariia Vladimirova · Jason Ramapuram · Jiaming Zeng · Xinyu Hu · Jiawei He · Danielle Maddix · Arunesh Mittal · Albert Shaw · Tuan Anh Le · Alexander Sagel · Lisha Chen · Victor Gallego · Mahdi Karami · Zihao Zhang · Tal Kachman · Noah Weber · Matt Benatan · Kumar K Sricharan · Vincent Cartillier · Ivan Ovinnikov · Buu Phan · Mahmoud Hossam · Liu Ziyin · Valerii Kharitonov · Eugene Golikov · Qiang Zhang · Jae Myung Kim · Sebastian Farquhar · Jishnu Mukhoti · Xu Hu · Gregory Gundersen · Lavanya Sita Tekumalla · Paris Perdikaris · Ershad Banijamali · Siddhartha Jain · Ge Liu · Martin Gottwald · Katy Blumer · Sukmin Yun · Ranganath Krishnan · Roman Novak · Yilun Du · Yu Gong · Beliz Gokkaya · Jessica Ai · Daniel Duckworth · Johannes von Oswald · Christian Henning · Louis-Philippe Morency · Ali Ghodsi · Mahesh Subedar · Jean-Pascal Pfister · Rémi Lebret · Chao Ma · Aleksander Wieczorek · Laurence Perreault Levasseur -
2017 : Poster Session »
Tsz Kit Lau · Johannes Maly · Nicolas Loizou · Christian Kroer · Yuan Yao · Youngsuk Park · Reka Agnes Kovacs · Dong Yin · Vlad Zhukov · Woosang Lim · David Barmherzig · Dimitris Metaxas · Bin Shi · Rajan Udwani · William Brendel · Yi Zhou · Vladimir Braverman · Sijia Liu · Eugene Golikov -
2017 Poster: Mean Field Residual Networks: On the Edge of Chaos »
Ge Yang · Samuel Schoenholz