Panel
Panel 2B-4: Extreme Compression for… & Exploring Length Generalization…
Cem Anil · Minjia Zhang
Author Information
Cem Anil (University of Toronto)
I'm a first-year PhD student at the University of Toronto and Vector Institute, supervised by Roger Grosse and Geoffrey Hinton.
Minjia Zhang (Microsoft)
More from the Same Authors
- 2023: Interactive Panel Discussion
Tanya Roosta · Tim Dettmers · Minjia Zhang · Nazneen Rajani
- 2022 Spotlight: ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Zhewei Yao · Reza Yazdani Aminabadi · Minjia Zhang · Xiaoxia Wu · Conglong Li · Yuxiong He
- 2022 Spotlight: Lightning Talks 5B-2
Conglong Li · Mohammad Azizmalayeri · Mojan Javaheripi · Pratik Vaishnavi · Jon Hasselgren · Hao Lu · Kevin Eykholt · Arshia Soltani Moakhar · Wenze Liu · Gustavo de Rosa · Nikolai Hofmann · Minjia Zhang · Zixuan Ye · Jacob Munkberg · Amir Rahmati · Arman Zarei · Subhabrata Mukherjee · Yuxiong He · Shital Shah · Reihaneh Zohrabi · Hongtao Fu · Tomasz Religa · Yuliang Liu · Mohammad Manzuri · Mohammad Hossein Rohban · Zhiguo Cao · Caio Cesar Teodoro Mendes · Sebastien Bubeck · Farinaz Koushanfar · Debadeepta Dey
- 2022 Spotlight: The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Conglong Li · Minjia Zhang · Yuxiong He
- 2022 Poster: ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Zhewei Yao · Reza Yazdani Aminabadi · Minjia Zhang · Xiaoxia Wu · Conglong Li · Yuxiong He
- 2022 Poster: The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Conglong Li · Minjia Zhang · Yuxiong He
- 2022 Poster: Exploring Length Generalization in Large Language Models
Cem Anil · Yuhuai Wu · Anders Andreassen · Aitor Lewkowycz · Vedant Misra · Vinay Ramasesh · Ambrose Slone · Guy Gur-Ari · Ethan Dyer · Behnam Neyshabur
- 2022 Poster: Solving Quantitative Reasoning Problems with Language Models
Aitor Lewkowycz · Anders Andreassen · David Dohan · Ethan Dyer · Henryk Michalewski · Vinay Ramasesh · Ambrose Slone · Cem Anil · Imanol Schlag · Theo Gutman-Solo · Yuhuai Wu · Behnam Neyshabur · Guy Gur-Ari · Vedant Misra
- 2022 Poster: XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient
Xiaoxia Wu · Zhewei Yao · Minjia Zhang · Conglong Li · Yuxiong He
- 2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation
Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse
- 2021 Poster: NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM
Connor Holmes · Minjia Zhang · Yuxiong He · Bo Wu
- 2021 Poster: Learning to Elect
Cem Anil · Xuchan Bao
- 2020 Poster: HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory
Jie Ren · Minjia Zhang · Dong Li
- 2020 Poster: Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
Minjia Zhang · Yuxiong He
- 2020 Poster: AdaTune: Adaptive Tensor Program Compilation Made Efficient
Menghao Li · Minjia Zhang · Chi Wang · Mingqin Li
- 2019 Poster: Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks
Qiyang Li · Saminul Haque · Cem Anil · James Lucas · Roger Grosse · Joern-Henrik Jacobsen
- 2018: TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Sicong (Sheldon) Huang · Cem Anil · Xuchan Bao
- 2018 Poster: Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang · Wenhan Wang · Xiaodong Liu · Jianfeng Gao · Yuxiong He