Ultra-Low Precision 4-bit Training of Deep Neural Networks

Xiao Sun, Naigang Wang, Chia-Yu Chen, Jiamin Ni, Ankur Agrawal, Xiaodong Cui, Swagath Venkataramani, Kaoutar El Maghraoui, Vijayalakshmi (Viji) Srinivasan, Kailash Gopalakrishnan

Oral presentation: Orals & Spotlights Track 18: Deep Learning
on 2020-12-09T06:00:00-08:00 - 2020-12-09T06:15:00-08:00
Poster Session 4 (more posters)
on 2020-12-09T09:00:00-08:00 - 2020-12-09T11:00:00-08:00
GatherTown: Deep learning ( Town C1 - Spot B0 )
Join GatherTown
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Abstract: In this paper, we propose a number of novel techniques and numerical representation formats that enable, for the very first time, the precision of training systems to be aggressively scaled from 8-bits to 4-bits. To enable this advance, we explore a novel adaptive Gradient Scaling technique (Gradscale) that addresses the challenges of insufficient range and resolution in quantized gradients as well as explores the impact of quantization errors observed during model training. We theoretically analyze the role of bias in gradient quantization and propose solutions that mitigate the impact of this bias on model convergence. Finally, we examine our techniques on a spectrum of deep learning models in computer vision, speech, and NLP. In combination with previously proposed solutions for 4-bit quantization of weight and activation tensors, 4-bit training shows a non-significant loss in accuracy across application domains while enabling significant hardware acceleration (> 7X over state-of-the-art FP16 systems).

Preview Video and Chat

Chat is not available.