Towards End-to-end 4-Bit Inference on Generative Large Language Models

Saleh Ashkboos · Ilia Markov · Elias Frantar · Tingxuan Zhong · Xincheng Wang · Jie Ren · Torsten Hoefler · Dan Alistarh

Abstract
