Poster - Recorded Presentation
Workshop: Machine Learning for Systems

Lattice Quantization

Clément Metz · Thibault Allenet · Johannes Thiele · Antoine DUPRET · Olivier BICHLER


Post-training quantization of neural networks consists in quantizing a model without retraining, which is user-friendly, fast, and data-frugal. In this paper, we propose LatticeQ, a novel post-training weight quantization method designed for deep convolutional neural networks. In contrast to the scalar rounding widely used in state-of-the-art quantization methods, LatticeQ uses a quantizer based on lattices, which are discrete algebraic structures. LatticeQ exploits the inner correlations between the model parameters to minimize quantization error. This allows it to achieve state-of-the-art results in post-training quantization. In particular, we demonstrate ImageNet classification results close to full precision on the popular ResNet-18/50, with little to no accuracy drop for 4-bit models.
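
To make the contrast between scalar rounding and lattice quantization concrete, here is a minimal sketch of a lattice quantizer. The abstract does not say which lattice or decoding procedure LatticeQ uses, so everything here is an assumption chosen for illustration: the lattice is the checkerboard lattice D_n (integer vectors whose coordinates sum to an even number), decoded with the classic round-then-fix-parity rule, and the names `nearest_dn_point` and `scale` are hypothetical. It should not be read as the authors' method.

```python
import math


def round_half_away(x: float) -> int:
    """Round to the nearest integer, ties away from zero.

    Python's built-in round() uses banker's rounding, so we avoid it here.
    """
    sign = 1 if x >= 0 else -1
    return sign * int(math.floor(abs(x) + 0.5))


def nearest_dn_point(v, scale=1.0):
    """Return the point of the scaled lattice scale * D_n closest to v.

    D_n is the set of integer vectors with an even coordinate sum.
    Illustrative choice only; not taken from the LatticeQ abstract.
    """
    x = [c / scale for c in v]
    # Step 1: plain scalar rounding of every coordinate.
    r = [round_half_away(c) for c in x]
    # Step 2: if the parity constraint is violated, move the coordinate
    # with the largest rounding error to its second-nearest integer.
    if sum(r) % 2 != 0:
        k = max(range(len(x)), key=lambda i: abs(x[i] - r[i]))
        r[k] += 1 if x[k] > r[k] else -1
    return [scale * c for c in r]


if __name__ == "__main__":
    # A pair of weights is quantized jointly rather than one by one.
    pair = [0.6, 0.3]
    print("scalar rounding :", [round_half_away(c) for c in pair])
    print("D_2 lattice     :", nearest_dn_point(pair))
```

The point of the sketch is only that a lattice quantizer maps a group of weights to a codeword jointly, so the choice made for one coordinate depends on the others, whereas scalar rounding treats each weight independently.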
