

Poster

Training Deep Models Faster with Robust, Approximate Importance Sampling

Tyler Johnson · Carlos Guestrin

Room 210 #58

Keywords: [ Efficient Training Methods ] [ Optimization for Deep Networks ]


Abstract:

In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples. In practice, the cost of computing importances greatly limits the impact of importance sampling. We propose a robust, approximate importance sampling procedure (RAIS) for stochastic gradient descent. By approximating the ideal sampling distribution using robust optimization, RAIS provides much of the benefit of exact importance sampling with drastically reduced overhead. Empirically, we find RAIS-SGD and standard SGD follow similar learning curves, but RAIS moves faster through these paths, achieving speed-ups of at least 20% and sometimes much more.
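To make the idea concrete, below is a minimal sketch of importance-sampled SGD on a toy logistic-regression problem. It is not the authors' RAIS procedure; it only illustrates the core mechanism the abstract refers to: sample example i with probability p_i (here from a cheap, stale gradient-norm proxy) and reweight its gradient by 1 / (n * p_i) so the stochastic gradient remains an unbiased estimate of the full gradient. The data, objective, and importance proxy are assumptions chosen for illustration.

```python
import numpy as np

# Sketch of importance-sampled SGD (illustrative only, not the RAIS algorithm).
# Toy data: logistic regression with a random ground-truth weight vector.
rng = np.random.default_rng(0)
n, d = 1000, 20
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) > 0).astype(float)

def per_example_grad(w, i):
    """Gradient of the logistic loss at a single example i."""
    p = 1.0 / (1.0 + np.exp(-X[i] @ w))
    return (p - y[i]) * X[i]

w = np.zeros(d)
lr = 0.1
stale_norms = np.ones(n)  # cheap proxy for per-example importance

for step in range(5000):
    probs = stale_norms / stale_norms.sum()      # sampling distribution
    i = rng.choice(n, p=probs)                   # prioritized example
    g = per_example_grad(w, i)
    w -= lr * g / (n * probs[i])                 # reweight -> unbiased gradient
    stale_norms[i] = np.linalg.norm(g) + 1e-8    # refresh importance estimate
```

Reweighting by 1 / (n * p_i) keeps the update unbiased for any sampling distribution; the difficulty the paper addresses is choosing p_i cheaply and robustly, since exact per-example gradient norms are too expensive to maintain.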
