Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update
Jiawei Zhao · Steve Dai · Rangharajan Venkatesan · Brian Zimmer · Mustafa Ali · Ming-Yu Liu · Brucek Khailany · Anima Anandkumar

Mon Dec 13 02:42 PM -- 02:51 PM (PST)

Representing deep neural networks (DNNs) with low-precision numbers is a promising approach for efficiently accelerating large-scale DNNs. However, previous methods typically keep a high-precision copy of the weights for weight updates during training. Training directly over low-precision weights remains an open problem because of the complex interactions between low-precision number systems and the underlying learning algorithms. To address this problem, we develop a low-precision training framework, termed LNS-Madam, in which we jointly design a logarithmic number system (LNS) and a multiplicative weight update algorithm (Madam). LNS-Madam yields low quantization error during weight updates, leading to stable convergence even when precision is limited. By replacing SGD or Adam with the Madam optimizer, training under LNS requires less weight precision during the updates while preserving state-of-the-art prediction accuracy.
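For intuition, below is a minimal NumPy sketch (not the authors' implementation) of the idea the abstract describes: because an LNS stores each weight as a sign and a log-magnitude, a multiplicative update becomes a simple addition in the log domain, followed by re-quantization to the LNS grid. The function names, bit width, learning rate, and RMS gradient normalization are illustrative assumptions.

import numpy as np

def quantize_log(log_w, fractional_bits=3):
    # Round a log2-domain value to a fixed number of fractional bits,
    # a simple stand-in for an LNS representation (assumed format).
    scale = 2 ** fractional_bits
    return np.round(log_w * scale) / scale

def madam_step_lns(sign_w, log_w, grad, lr=0.01, fractional_bits=3, eps=1e-12):
    # One multiplicative weight update carried out in the log2 domain.
    #   sign_w : sign of each weight (multiplicative updates preserve sign)
    #   log_w  : log2 of |w|, stored in quantized (LNS-like) form
    #   grad   : gradient of the loss w.r.t. w
    # Normalize the gradient so the step size is scale-free, as
    # multiplicative updates typically require (illustrative choice).
    g_norm = grad / (np.sqrt(np.mean(grad ** 2)) + eps)
    # The multiplicative update w <- w * 2^(-lr * sign(w) * g_norm)
    # is an addition in the log domain ...
    log_w = log_w - lr * sign_w * g_norm
    # ... followed by re-quantization to the LNS grid.
    return quantize_log(log_w, fractional_bits)

# Usage: weights are recovered as sign * 2^log_w (nonzero weights assumed).
w = np.array([0.5, -1.25, 2.0])
sign_w, log_w = np.sign(w), quantize_log(np.log2(np.abs(w)))
grad = np.array([0.1, -0.2, 0.05])
log_w = madam_step_lns(sign_w, log_w, grad)
w_new = sign_w * 2.0 ** log_w

Because the update never leaves the log domain, the weight update itself incurs only the rounding error of the final quantization step, which is the property that allows stable convergence at limited weight precision.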

Author Information

Jiawei Zhao (Caltech)
Steve Dai (NVIDIA)
Rangharajan Venkatesan (NVIDIA)
Brian Zimmer (NVIDIA)
Mustafa Ali (Purdue University)
Ming-Yu Liu (NVIDIA Research)
Brucek Khailany (NVIDIA)
Anima Anandkumar (NVIDIA / Caltech)

Anima Anandkumar is a Bren Professor in the CMS department at Caltech and a director of machine learning research at NVIDIA. Her research spans both theoretical and practical aspects of large-scale machine learning. In particular, she has spearheaded research in tensor-algebraic methods, non-convex optimization, probabilistic models, and deep learning. Anima is the recipient of several awards and honors, such as the Bren named chair professorship at Caltech, an Alfred P. Sloan Fellowship, Young Investigator Awards from the Air Force and Army research offices, faculty fellowships from Microsoft, Google, and Adobe, and several best paper awards. Anima received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, a visiting researcher at Microsoft Research New England in 2012 and 2014, an assistant professor at U.C. Irvine between 2010 and 2016, an associate professor at U.C. Irvine between 2016 and 2017, and a principal scientist at Amazon Web Services between 2016 and 2018.
