Learning Wake-Sleep Recurrent Attention Models
Jimmy Ba · Russ Salakhutdinov · Roger Grosse · Brendan J Frey

Wed Dec 09 04:00 PM -- 08:59 PM (PST) @ 210 C #14

Despite their success, convolutional neural networks are computationally expensive because they must examine all image locations. Stochastic attention-based models have been shown to improve computational efficiency at test time, but they remain difficult to train because of intractable posterior inference and high variance in the stochastic gradient estimates. Borrowing techniques from the literature on training deep generative models, we present the Wake-Sleep Recurrent Attention Model, a method for training stochastic attention networks which improves posterior inference and which reduces the variability in the stochastic gradients. We show that our method can greatly reduce the training time for stochastic attention networks in the domains of image classification and caption generation.

Author Information

Jimmy Ba (University of Toronto)
Russ Salakhutdinov (University of Toronto)
Roger Grosse (University of Toronto)
Brendan J Frey (University of Toronto)

Brendan Frey is Co-Founder and CEO of Deep Genomics, a Co-Founder of the Vector Institute for Artificial Intelligence, and a Professor of Engineering and Medicine at the University of Toronto. He is internationally recognized as a leader in machine learning and in genome biology, and his group has published over a dozen papers on these topics in Science, Nature and Cell. His work on using deep learning to identify protein-DNA interactions was recently highlighted on the front cover of Nature Biotechnology (2015), while his work on deep learning dates back to an early paper on what are now called variational autoencoders (Science 1995). He is a Fellow of the Royal Society of Canada, a Fellow of the Institute of Electrical and Electronics Engineers, and a Fellow of the American Association for the Advancement of Science. He has consulted for several industrial research and development laboratories in Canada, the United States and England, and has served on the Technical Advisory Board of Microsoft Research.

More from the Same Authors