Towards End-to-end 4-Bit Inference on Generative Large Language Models

Saleh Ashkboos ⋅ Ilia Markov ⋅ Elias Frantar ⋅ Tingxuan Zhong ⋅ Xincheng Wang ⋅ Jie Ren ⋅ Torsten Hoefler ⋅ Dan Alistarh

Abstract
