Poster
Training Deep Models Faster with Robust, Approximate Importance Sampling
Tyler Johnson · Carlos Guestrin
Room 210 #58
Keywords: [ Optimization for Deep Networks ] [ Efficient Training Methods ]
In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples. In practice, the cost of computing importances greatly limits the impact of importance sampling. We propose a robust, approximate importance sampling procedure (RAIS) for stochastic gradient descent. By approximating the ideal sampling distribution using robust optimization, RAIS provides much of the benefit of exact importance sampling with drastically reduced overhead. Empirically, we find RAIS-SGD and standard SGD follow similar learning curves, but RAIS-SGD moves along these curves faster, achieving speed-ups of at least 20% and sometimes much more.
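As background for the abstract above, the sketch below illustrates plain importance-sampled SGD on a least-squares problem: examples are drawn with probability proportional to a per-example gradient-norm score, and each update is reweighted by 1/(n p_i) so it remains an unbiased estimate of the full gradient. The problem setup, score, and variable names are illustrative assumptions, and recomputing exact scores every step is precisely the overhead the abstract refers to; this is not the paper's RAIS procedure, which approximates the sampling distribution cheaply via robust optimization.

```python
import numpy as np

# Illustrative importance-sampled SGD for least-squares regression.
# NOTE: this recomputes exact per-example scores every step (O(n*d)),
# which is the costly "exact importance sampling" baseline, not RAIS.

rng = np.random.default_rng(0)
n, d = 1000, 20
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.01

for step in range(2000):
    # Per-example importance score: gradient norm of the squared loss,
    # |x_i^T w - y_i| * ||x_i||, plus a small floor to avoid zeros.
    residuals = np.abs(X @ w - y)
    scores = residuals * np.linalg.norm(X, axis=1) + 1e-8
    p = scores / scores.sum()

    i = rng.choice(n, p=p)                 # sample one example by importance
    grad_i = (X[i] @ w - y[i]) * X[i]      # gradient on the sampled example
    w -= lr * grad_i / (n * p[i])          # reweight to keep the update unbiased

print("final mean squared error:", np.mean((X @ w - y) ** 2))
```

Sampling proportionally to gradient norms reduces the variance of the stochastic update relative to uniform sampling; the practical question the paper addresses is how to obtain a good sampling distribution without paying this full per-step cost.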