Poster
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Jack Rae · Jonathan J Hunt · Ivo Danihelka · Tim Harley · Andrew Senior · Gregory Wayne · Alex Graves · Timothy Lillicrap

Wed Dec 07 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #17
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows, limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs $1,\!000\times$ faster and with $3,\!000\times$ less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring $100,\!000$s of time steps and memories. We also show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
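To make the sparse access idea concrete, the following is a minimal numpy sketch of a sparse content-based read: attention weights are computed over only the top-k most similar memory rows, so each read touches k slots instead of all N. This is illustrative only; the function name and parameters are assumptions rather than the paper's implementation, and SAM itself locates the top-k rows with approximate nearest-neighbour indexing rather than the full similarity scan shown here.

```python
import numpy as np

def sparse_read(memory, key, k=4):
    """Illustrative sparse content-based read: attend over only the
    top-k memory rows by cosine similarity, so the read vector (and
    its gradient) depends on k slots rather than all N.
    Hypothetical sketch, not SAM's actual implementation."""
    # Cosine similarity between the query key and every memory row.
    # A full scan is shown for clarity; SAM avoids it with an
    # approximate nearest-neighbour index.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sims = memory @ key / norms
    # Indices of the k most similar rows; all other weights are exactly zero.
    top = np.argpartition(-sims, k)[:k]
    # Softmax restricted to the selected rows.
    w = np.exp(sims[top] - sims[top].max())
    w /= w.sum()
    # Read vector: weighted sum over just k memory rows.
    return w @ memory[top]

# Usage: a memory of 100,000 slots; each read touches only k of them.
mem = np.random.randn(100_000, 32).astype(np.float32)
r = sparse_read(mem, np.random.randn(32).astype(np.float32), k=4)
```

Because the weights over unselected rows are exactly zero, gradients flow through only k rows per read, which is the source of the space and time savings described in the abstract.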

Author Information

Jack Rae (Google DeepMind)
Jonathan J Hunt (Brain Corporation)
Ivo Danihelka (DeepMind)
Tim Harley (Google DeepMind)
Andrew Senior (DeepMind)
Greg Wayne (Google DeepMind)
Alex Graves (Google DeepMind)

His main contributions to neural networks include the Connectionist Temporal Classification training algorithm (widely used for speech, handwriting and gesture recognition, e.g. by Google voice search), a type of differentiable attention for RNNs (originally for handwriting generation, now a standard tool in computer vision, machine translation and elsewhere), stochastic gradient variational inference, and Neural Turing Machines. He works at Google DeepMind.

Timothy Lillicrap (Google DeepMind)
