Poster | Tue 14:00 | FP8 Quantization: The Power of the Exponent
Andrey Kuzmin · Mart van Baalen · Yuwei Ren · Markus Nagel · Jorn Peters · Tijmen Blankevoort

Workshop | Quantization-aware Policy Distillation (QPD)
Thomas Avé · Kevin Mets · Tom De Schepper · Steven Latre

Poster | Leveraging Inter-Layer Dependency for Post-Training Quantization
Changbao Wang · DanDan Zheng · Yuanliu Liu · Liang Li

Poster | Tue 9:00 | XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient
Xiaoxia Wu · Zhewei Yao · Minjia Zhang · Conglong Li · Yuxiong He

Poster | Thu 14:00 | ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Zhewei Yao · Reza Yazdani Aminabadi · Minjia Zhang · Xiaoxia Wu · Conglong Li · Yuxiong He

Workshop | Fri 3:50 | Post-Training Neural Network Compression With Variational Bayesian Quantization
Zipei Tan · Robert Bamler

Poster | Thu 9:00 | Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar · Dan Alistarh

Workshop | Fri 8:30 | SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Song Han

Poster | Thu 9:00 | Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai · Lu Hou · Lifeng Shang · Xin Jiang · Irwin King · Michael R Lyu

Poster | TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training
Chang Chen · Min Li · Zhihua Wu · Dianhai Yu · Chao Yang

Poster | SAPipe: Staleness-Aware Pipeline for Data Parallel DNN Training
Yangrui Chen · Cong Xie · Meng Ma · Juncheng Gu · Yanghua Peng · Haibin Lin · Chuan Wu · Yibo Zhu