Self-attention and feedforward layers in large-scale Transformer models are overparameterized, limiting inference speed and energy efficiency. Tensor decomposition is a promising technique to reduce parameter redundancy by expressing weight matrices in an efficiently factorized form. Prior efforts used manual or heuristic decomposition settings without hardware-aware customization, resulting in poor hardware efficiency and large performance degradation. In this work, we propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of tensor decompositions and automates the choice of tensorization shape and decomposition rank with hardware-aware co-optimization. We jointly investigate tensor contraction path optimizations and a fused Einsum mapping strategy to bridge the gap between theoretical benefits and real hardware efficiency improvement. Our two-stage knowledge distillation flow resolves the trainability bottleneck and thus significantly boosts the final accuracy of factorized Transformers. We find that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss and achieve a better efficiency-accuracy Pareto frontier than hand-tuned and heuristic baselines.
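As a rough illustration of what "expressing weight matrices in a factorized form" and "contraction path" mean, the sketch below uses the simplest possible case: a plain rank-r matrix factorization W ≈ U V evaluated with NumPy's einsum. This is not the HEAT framework or its search procedure. The layer sizes, the rank, and the helper name `factorized_linear` are assumptions chosen only for the example.

```python
import numpy as np


def factorized_linear(x, U, V):
    """Compute x @ (U @ V) without building the dense d_in x d_out weight."""
    # optimize=True lets NumPy choose the contraction order for the three operands.
    return np.einsum("bi,ir,ro->bo", x, U, V, optimize=True)


rng = np.random.default_rng(0)

# Illustrative sizes only (hypothetical feedforward-like layer), not taken from the paper.
d_in, d_out, rank, batch = 768, 3072, 64, 8

U = rng.standard_normal((d_in, rank))
V = rng.standard_normal((rank, d_out))
x = rng.standard_normal((batch, d_in))

# Parameter count of the dense weight versus the two factors.
dense_params = d_in * d_out
factored_params = rank * (d_in + d_out)

y_fact = factorized_linear(x, U, V)
y_dense = x @ (U @ V)  # reference result using the materialized weight

# einsum_path reports which pairs of operands are contracted first and the
# estimated cost, which is the kind of choice a contraction path optimizer makes.
path, report = np.einsum_path("bi,ir,ro->bo", x, U, V, optimize="optimal")

print(f"dense parameters:    {dense_params}")
print(f"factored parameters: {factored_params}")
print(f"outputs match:       {np.allclose(y_fact, y_dense)}")
print(report)
```

With these assumed sizes the two factors hold roughly a tenth of the dense parameters. The order of contraction also matters: multiplying the small activation batch into U first is far cheaper than first forming U V. Whether such savings translate into real speed or energy gains depends on the hardware, which is the gap the abstract describes.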
Author Information
Jiaqi Gu (The University of Texas at Austin)
Ben Keller (NVIDIA)
Jean Kossaifi (NVIDIA Research)
Anima Anandkumar (NVIDIA / Caltech)
Brucek Khailany (NVIDIA)
David Pan (The University of Texas at Austin)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 : HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression »
Sat. Dec 3rd 05:25 -- 05:35 PM Room
More from the Same Authors
-
2021 : Reinforcement Learning in Factored Action Spaces using Tensor Decompositions »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviichuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2022 Poster: MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training »
De-An Huang · Zhiding Yu · Anima Anandkumar -
2022 : Can you label less by using out-of-domain data? Active & Transfer Learning with Few-shot Instructions »
Rafal Kocielnik · Sara Kangaslahti · Shrimai Prabhumoye · Meena Hari · Michael Alvarez · Anima Anandkumar -
2022 : ZerO Initialization: Initializing Neural Networks with only Zeros and Ones »
Jiawei Zhao · Florian Schaefer · Anima Anandkumar -
2022 : Retrieval-based Controllable Molecule Generation »
Jack Wang · Weili Nie · Zhuoran Qiao · Chaowei Xiao · Richard Baraniuk · Anima Anandkumar -
2022 : Towards Neural Variational Monte Carlo That Scales Linearly with System Size »
Or Sharir · Garnet Chan · Anima Anandkumar -
2022 : Incremental Fourier Neural Operator »
Jiawei Zhao · Robert Joseph George · Yifei Zhang · Zongyi Li · Anima Anandkumar -
2022 : FALCON: Fourier Adaptive Learning and Control for Disturbance Rejection Under Extreme Turbulence »
Sahin Lale · Peter Renn · Kamyar Azizzadenesheli · Babak Hassibi · Morteza Gharib · Anima Anandkumar -
2022 : Fourier Continuation for Exact Derivative Computation in Physics-Informed Neural Operators »
Haydn Maust · Zongyi Li · Yixuan Wang · Anima Anandkumar -
2022 : MoleculeCLIP: Learning Transferable Molecule Multi-Modality Models via Natural Language »
Shengchao Liu · Weili Nie · Chengpeng Wang · Jiarui Lu · Zhuoran Qiao · Ling Liu · Jian Tang · Anima Anandkumar · Chaowei Xiao -
2022 : Fourier Neural Operator for Plasma Modelling »
Vignesh Gopakumar · Stanislas Pamela · Lorenzo Zanisi · Zongyi Li · Anima Anandkumar -
2022 : VIMA: General Robot Manipulation with Multimodal Prompts »
Yunfan Jiang · Agrim Gupta · Zichen Zhang · Guanzhi Wang · Yongqiang Dou · Yanjun Chen · Fei-Fei Li · Anima Anandkumar · Yuke Zhu · Linxi Fan -
2022 : Fast Sampling of Diffusion Models via Operator Learning »
Hongkai Zheng · Weili Nie · Arash Vahdat · Kamyar Azizzadenesheli · Anima Anandkumar -
2022 : DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 : An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design »
Mingjie Liu · Haoyu Yang · David Pan · Brucek Khailany · Mark Ren -
2022 : Contributed Talk: DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 Workshop: Trustworthy and Socially Responsible Machine Learning »
Huan Zhang · Linyi Li · Chaowei Xiao · J. Zico Kolter · Anima Anandkumar · Bo Li -
2022 Spotlight: NeurOLight: A Physics-Agnostic Neural Operator Enabling Parametric Photonic Device Simulation »
Jiaqi Gu · Zhengqi Gao · Chenghao Feng · Hanqing Zhu · Ray Chen · Duane Boning · David Pan -
2022 Workshop: Machine Learning and the Physical Sciences »
Atilim Gunes Baydin · Adji Bousso Dieng · Emine Kucukbenli · Gilles Louppe · Siddharth Mishra-Sharma · Benjamin Nachman · Brian Nord · Savannah Thais · Anima Anandkumar · Kyle Cranmer · Lenka Zdeborová · Rianne van den Berg -
2022 Workshop: AI for Science: Progress and Promises »
Yi Ding · Yuanqi Du · Tianfan Fu · Hanchen Wang · Anima Anandkumar · Yoshua Bengio · Anthony Gitter · Carla Gomes · Aviv Regev · Max Welling · Marinka Zitnik -
2022 Poster: NeurOLight: A Physics-Agnostic Neural Operator Enabling Parametric Photonic Device Simulation »
Jiaqi Gu · Zhengqi Gao · Chenghao Feng · Hanqing Zhu · Ray Chen · Duane Boning · David Pan -
2022 Poster: Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models »
Manli Shu · Weili Nie · De-An Huang · Zhiding Yu · Tom Goldstein · Anima Anandkumar · Chaowei Xiao -
2022 Poster: PeRFception: Perception using Radiance Fields »
Yoonwoo Jeong · Seungjoo Shin · Junha Lee · Chris Choy · Anima Anandkumar · Minsu Cho · Jaesik Park -
2022 Poster: Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits »
Tianyuan Jin · Pan Xu · Xiaokui Xiao · Anima Anandkumar -
2022 Poster: Learning Chaotic Dynamics in Dissipative Systems »
Zongyi Li · Miguel Liu-Schiaffini · Nikola Kovachki · Kamyar Azizzadenesheli · Burigede Liu · Kaushik Bhattacharya · Andrew Stuart · Anima Anandkumar -
2022 Poster: Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models »
Boxin Wang · Wei Ping · Chaowei Xiao · Peng Xu · Mostofa Patwary · Mohammad Shoeybi · Bo Li · Anima Anandkumar · Bryan Catanzaro -
2022 Poster: Pre-Trained Language Models for Interactive Decision-Making »
Shuang Li · Xavier Puig · Chris Paxton · Yilun Du · Clinton Wang · Linxi Fan · Tao Chen · De-An Huang · Ekin Akyürek · Anima Anandkumar · Jacob Andreas · Igor Mordatch · Antonio Torralba · Yuke Zhu -
2022 Poster: MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge »
Linxi Fan · Guanzhi Wang · Yunfan Jiang · Ajay Mandlekar · Yuncong Yang · Haoyi Zhu · Andrew Tang · De-An Huang · Yuke Zhu · Anima Anandkumar -
2021 : Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update »
Jiawei Zhao · Steve Dai · Rangha Venkatesan · Brian Zimmer · Mustafa Ali · Ming-Yu Liu · Brucek Khailany · Anima Anandkumar -
2021 Poster: AugMax: Adversarial Composition of Random Augmentations for Robust Training »
Haotao Wang · Chaowei Xiao · Jean Kossaifi · Zhiding Yu · Anima Anandkumar · Zhangyang Wang -
2021 Poster: L2ight: Enabling On-Chip Learning for Optical Neural Networks via Efficient in-situ Subspace Optimization »
Jiaqi Gu · Hanqing Zhu · Chenghao Feng · Zixuan Jiang · Ray Chen · David Pan -
2020 : Invited Talk 5: Live Presentation of TensorLy By Jean Kossaifi »
Animashree Anandkumar · Jean Kossaifi -
2020 Poster: Convolutional Tensor-Train LSTM for Spatio-Temporal Learning »
Jiahao Su · Wonmin Byeon · Jean Kossaifi · Furong Huang · Jan Kautz · Anima Anandkumar -
2016 : Anima Anandkumar »
Anima Anandkumar -
2016 Workshop: Learning with Tensors: Why Now and How? »
Anima Anandkumar · Rong Ge · Yan Liu · Maximilian Nickel · Qi (Rose) Yu -
2016 Workshop: Nonconvex Optimization for Machine Learning: Theory and Practice »
Hossein Mobahi · Anima Anandkumar · Percy Liang · Stefanie Jegelka · Anna Choromanska -
2016 Poster: Online and Differentially-Private Tensor Decomposition »
Yining Wang · Anima Anandkumar -
2015 : Opening and Overview »
Anima Anandkumar -
2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice »
Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh -
2015 Poster: Fast and Guaranteed Tensor Decomposition via Sketching »
Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar -
2015 Spotlight: Fast and Guaranteed Tensor Decomposition via Sketching »
Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar -
2014 Poster: Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition »
Hanie Sedghi · Anima Anandkumar · Edmond A Jonckheere -
2013 Workshop: Topic Models: Computation, Application, and Evaluation »
David Mimno · Amr Ahmed · Jordan Boyd-Graber · Ankur Moitra · Hanna Wallach · Alexander Smola · David Blei · Anima Anandkumar -
2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity »
Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade -
2012 Poster: Learning Mixtures of Tree Graphical Models »
Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade -
2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Poster: Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs »
Anima Anandkumar · Ragupathyraj Valluvan -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang