Poster
Dimension-Free Bounds for Low-Precision Training
Zheng Li · Christopher De Sa
East Exhibition Hall B, C #159
Keywords: [ Convex Optimization ] [ Optimization ] [ Non-Convex Optimization ]
Abstract:
Low-precision training is a promising way of decreasing the time and energy cost of training machine learning models.
Previous work has analyzed low-precision training algorithms, such as low-precision stochastic gradient descent, and derived theoretical bounds on their convergence rates.
These bounds tend to depend on the dimension of the model $d$ in that the number of bits needed to achieve a particular error bound increases as $d$ increases.
In this paper, we derive new bounds for low-precision training algorithms that do not contain the dimension $d$, which lets us better understand what affects the convergence of these algorithms as parameters scale.
Our methods also generalize naturally to let us prove new convergence bounds on low-precision training with other quantization schemes, such as low-precision floating-point computation and logarithmic quantization.
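To make the setting concrete, the following is a minimal sketch of the kind of low-precision SGD the abstract refers to, assuming fixed-point quantization with unbiased stochastic rounding; the function names, parameters, and grid choices here are illustrative and are not the paper's exact construction.

```python
import numpy as np

def quantize_fixed_point(x, delta=2.0 ** -6, bits=8, rng=np.random.default_rng(0)):
    """Stochastic rounding onto a fixed-point grid with spacing `delta`.

    Illustrative sketch (not the paper's construction): values are clipped to
    the representable range and rounded to one of the two nearest grid points
    with probabilities chosen so the quantizer is unbiased in expectation.
    """
    lo, hi = -(2 ** (bits - 1)) * delta, (2 ** (bits - 1) - 1) * delta
    x = np.clip(x, lo, hi)
    scaled = x / delta
    floor = np.floor(scaled)
    prob_up = scaled - floor                     # P(round up) = fractional part
    rounded = floor + (rng.random(x.shape) < prob_up)
    return rounded * delta

def low_precision_sgd_step(w, grad, lr, **q_kwargs):
    """One SGD step with the updated iterate stored in low precision."""
    return quantize_fixed_point(w - lr * grad, **q_kwargs)

# Toy usage: minimize f(w) = 0.5 * ||w||^2 with quantized iterates.
w = np.ones(4)
for _ in range(100):
    w = low_precision_sgd_step(w, grad=w, lr=0.1)
print(w)  # iterates settle within roughly one grid spacing of the optimum w* = 0
```

The grid spacing `delta` (and the number of bits) controls the floor on the achievable error; the paper's question is how the bits required for a target error depend, or do not depend, on the model dimension $d$.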