Timezone: »
We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner purely by interacting with an environment in reinforcement learning setting. The network builds an internal plan, which is continuously updated upon observation of the next input from the environment. It can also partition this internal representation into contiguous sub-sequences by learning for how long the plan can be committed to -- i.e. followed without replaning. Combining these properties, the proposed model, dubbed STRategic Attentive Writer (STRAW) can learn high-level, temporally abstracted macro-actions of varying lengths that are solely learnt from data without any prior information. These macro-actions enable both structured exploration and economic computation. We experimentally demonstrate that STRAW delivers strong improvements on several ATARI games by employing temporally extended planning strategies (e.g. Ms. Pacman and Frostbite). It is at the same time a general algorithm that can be applied on any sequence data. To that end, we also show that when trained on text prediction task, STRAW naturally predicts frequent n-grams (instead of macro-actions), demonstrating the generality of the approach.
Author Information
Alexander (Sasha) Vezhnevets (Google DeepMind)
Volodymyr Mnih (DeepMind)
Simon Osindero (Google DeepMind)
Alex Graves (Google DeepMind)
Main contributions to neural networks include the Connectionist Temporal Classification training algorithm (widely used for speech, handwriting and gesture recognition, e.g. by Google voice search), a type of differentiable attention for RNNs (originally for handwriting generation, now a standard tool in computer vision, machine translation and elsewhere), stochastic gradient variational inference, and Neural Turing Machines. He works at Google Deep Mind.
Oriol Vinyals (DeepMind)
Oriol Vinyals is a Research Scientist at Google. He works in deep learning with the Google Brain team. Oriol holds a Ph.D. in EECS from University of California, Berkeley, and a Masters degree from University of California, San Diego. He is a recipient of the 2011 Microsoft Research PhD Fellowship. He was an early adopter of the new deep learning wave at Berkeley, and in his thesis he focused on non-convex optimization and recurrent neural networks. At Google Brain he continues working on his areas of interest, which include artificial intelligence, with particular emphasis on machine learning, language, and vision.
John Agapiou (Google DeepMind)
koray kavukcuoglu (Google DeepMind)
More from the Same Authors
-
2021 : Wasserstein Distance Maximizing Intrinsic Control »
Ishan Durugkar · Steven Hansen · Stephen Spencer · Volodymyr Mnih · Ishan Durugkar -
2021 : Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria »
Kavya Kopparapu · Edgar Dueñez-Guzman · Jayd Matyas · Alexander Vezhnevets · John Agapiou · Kevin McKee · Richard Everett · Janusz Marecki · Joel Leibo · Thore Graepel -
2022 : In-context Reinforcement Learning with Algorithm Distillation »
Michael Laskin · Luyu Wang · Junhyuk Oh · Emilio Parisotto · Stephen Spencer · Richie Steigerwald · DJ Strouse · Steven Hansen · Angelos Filos · Ethan Brooks · Maxime Gazeau · Himanshu Sahni · Satinder Singh · Volodymyr Mnih -
2022 : In-context Reinforcement Learning with Algorithm Distillation »
Michael Laskin · Luyu Wang · Junhyuk Oh · Emilio Parisotto · Stephen Spencer · Richie Steigerwald · DJ Strouse · Steven Hansen · Angelos Filos · Ethan Brooks · Maxime Gazeau · Himanshu Sahni · Satinder Singh · Volodymyr Mnih -
2022 Poster: Palm up: Playing in the Latent Manifold for Unsupervised Pretraining »
Hao Liu · Tom Zahavy · Volodymyr Mnih · Satinder Singh -
2022 Poster: An empirical analysis of compute-optimal large language model training »
Jordan Hoffmann · Sebastian Borgeaud · Arthur Mensch · Elena Buchatskaya · Trevor Cai · Eliza Rutherford · Diego de Las Casas · Lisa Anne Hendricks · Johannes Welbl · Aidan Clark · Thomas Hennigan · Eric Noland · Katherine Millican · George van den Driessche · Bogdan Damoc · Aurelia Guy · Simon Osindero · Karén Simonyan · Erich Elsen · Oriol Vinyals · Jack Rae · Laurent Sifre -
2022 Poster: Flamingo: a Visual Language Model for Few-Shot Learning »
Jean-Baptiste Alayrac · Jeff Donahue · Pauline Luc · Antoine Miech · Iain Barr · Yana Hasson · Karel Lenc · Arthur Mensch · Katherine Millican · Malcolm Reynolds · Roman Ring · Eliza Rutherford · Serkan Cabi · Tengda Han · Zhitao Gong · Sina Samangooei · Marianne Monteiro · Jacob L Menick · Sebastian Borgeaud · Andy Brock · Aida Nematzadeh · Sahand Sharifzadeh · Mikołaj Bińkowski · Ricardo Barreira · Oriol Vinyals · Andrew Zisserman · Karén Simonyan -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2021 : Live Q&A session: Oriol Vinyals (DeepMind) »
Oriol Vinyals -
2021 : Invited Talk: Oriol Vinyals (DeepMind) »
Oriol Vinyals -
2021 Panel: The Consequences of Massive Scaling in Machine Learning »
Noah Goodman · Melanie Mitchell · Joelle Pineau · Oriol Vinyals · Jared Kaplan -
2020 : QA: Oriol Vinyals »
Oriol Vinyals -
2020 : Invited Talk: Oriol Vinyals »
Oriol Vinyals -
2020 Poster: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Poster: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning »
Jean-Bastien Grill · Florian Strub · Florent Altché · Corentin Tallec · Pierre Richemond · Elena Buchatskaya · Carl Doersch · Bernardo Avila Pires · Daniel (Zhaohan) Guo · Mohammad Gheshlaghi Azar · Bilal Piot · koray kavukcuoglu · Remi Munos · Michal Valko -
2020 Spotlight: Pointer Graph Networks »
Petar Veličković · Lars Buesing · Matthew Overlan · Razvan Pascanu · Oriol Vinyals · Charles Blundell -
2020 Oral: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning »
Jean-Bastien Grill · Florian Strub · Florent Altché · Corentin Tallec · Pierre Richemond · Elena Buchatskaya · Carl Doersch · Bernardo Avila Pires · Daniel (Zhaohan) Guo · Mohammad Gheshlaghi Azar · Bilal Piot · koray kavukcuoglu · Remi Munos · Michal Valko -
2020 Session: Orals & Spotlights Track 28: Deep Learning »
Oriol Vinyals · Guido Montufar -
2020 : Keynote Presentation I Oriol Vinyals »
Oriol Vinyals -
2019 : Grandmaster Level in StarCraft II using Multi-Agent Reinforcement Learning - Invited Talk »
Oriol Vinyals -
2019 : The MineRL competition »
Misa Ogura · Joe Booth · Sophia Sun · Nicholay Topin · Brandon Houghton · William Guss · Stephanie Milani · Oriol Vinyals · Katja Hofmann · JIA KIM · Karolis Ramanauskas · Florian Laurent · Daichi Nishio · Anssi Kanervisto · Alexey Skrynnik · Artemij Amiranashvili · Christian Scheller · KAIXIN WANG · Yanick Schraner -
2019 Poster: Generating Diverse High-Fidelity Images with VQ-VAE-2 »
Ali Razavi · Aaron van den Oord · Oriol Vinyals -
2019 Poster: Unsupervised Learning of Object Keypoints for Perception and Control »
Tejas Kulkarni · Ankush Gupta · Catalin Ionescu · Sebastian Borgeaud · Malcolm Reynolds · Andrew Zisserman · Volodymyr Mnih -
2019 Poster: Classification Accuracy Score for Conditional Generative Models »
Suman Ravuri · Oriol Vinyals -
2018 Poster: Learning to Navigate in Cities Without a Map »
Piotr Mirowski · Matt Grimes · Mateusz Malinowski · Karl Moritz Hermann · Keith Anderson · Denis Teplyashin · Karen Simonyan · koray kavukcuoglu · Andrew Zisserman · Raia Hadsell -
2018 Poster: Relational recurrent neural networks »
Adam Santoro · Ryan Faulkner · David Raposo · Jack Rae · Mike Chrzanowski · Theophane Weber · Daan Wierstra · Oriol Vinyals · Razvan Pascanu · Timothy Lillicrap -
2017 : Meta Unsupervised Learning »
Oriol Vinyals -
2017 Workshop: Deep Learning: Bridging Theory and Practice »
Sanjeev Arora · Maithra Raghu · Russ Salakhutdinov · Ludwig Schmidt · Oriol Vinyals -
2017 : Distilling Expensive Simulations with Neural Networks »
Oriol Vinyals -
2017 Poster: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Oral: Imagination-Augmented Agents for Deep Reinforcement Learning »
Sébastien Racanière · Theophane Weber · David Reichert · Lars Buesing · Arthur Guez · Danilo Jimenez Rezende · Adrià Puigdomènech Badia · Oriol Vinyals · Nicolas Heess · Yujia Li · Razvan Pascanu · Peter Battaglia · Demis Hassabis · David Silver · Daan Wierstra -
2017 Poster: Neural Discrete Representation Learning »
Aaron van den Oord · Oriol Vinyals · koray kavukcuoglu -
2017 Tutorial: Deep Learning: Practice and Trends »
Nando de Freitas · Scott Reed · Oriol Vinyals -
2016 Symposium: Recurrent Neural Networks and Other Machines that Learn Algorithms »
Jürgen Schmidhuber · Sepp Hochreiter · Alex Graves · Rupesh K Srivastava -
2016 Poster: Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes »
Jack Rae · Jonathan J Hunt · Ivo Danihelka · Tim Harley · Andrew Senior · Gregory Wayne · Alex Graves · Timothy Lillicrap -
2016 Poster: Conditional Image Generation with PixelCNN Decoders »
Aaron van den Oord · Nal Kalchbrenner · Lasse Espeholt · koray kavukcuoglu · Oriol Vinyals · Alex Graves -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · koray kavukcuoglu · Geoffrey E Hinton -
2016 Poster: Learning values across many orders of magnitude »
Hado van Hasselt · Arthur Guez · Arthur Guez · Matteo Hessel · Volodymyr Mnih · David Silver -
2016 Poster: An Online Sequence-to-Sequence Model Using Partial Conditioning »
Navdeep Jaitly · Quoc V Le · Oriol Vinyals · Ilya Sutskever · David Sussillo · Samy Bengio -
2016 Poster: Memory-Efficient Backpropagation Through Time »
Audrunas Gruslys · Remi Munos · Ivo Danihelka · Marc Lanctot · Alex Graves -
2016 Poster: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu -
2016 Oral: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu -
2016 Poster: Interaction Networks for Learning about Objects, Relations and Physics »
Peter Battaglia · Razvan Pascanu · Matthew Lai · Danilo Jimenez Rezende · koray kavukcuoglu -
2016 Poster: Matching Networks for One Shot Learning »
Oriol Vinyals · Charles Blundell · Timothy Lillicrap · koray kavukcuoglu · Daan Wierstra -
2015 : The Deep Reinforcement Learning Boom »
Volodymyr Mnih -
2015 Poster: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks »
Samy Bengio · Oriol Vinyals · Navdeep Jaitly · Noam Shazeer -
2015 Poster: Natural Neural Networks »
Guillaume Desjardins · Karen Simonyan · Razvan Pascanu · koray kavukcuoglu -
2015 Poster: Spatial Transformer Networks »
Max Jaderberg · Karen Simonyan · Andrew Zisserman · koray kavukcuoglu -
2015 Poster: Pointer Networks »
Oriol Vinyals · Meire Fortunato · Navdeep Jaitly -
2015 Spotlight: Pointer Networks »
Oriol Vinyals · Meire Fortunato · Navdeep Jaitly -
2015 Spotlight: Spatial Transformer Networks »
Max Jaderberg · Karen Simonyan · Andrew Zisserman · koray kavukcuoglu -
2015 Poster: Grammar as a Foreign Language »
Oriol Vinyals · Łukasz Kaiser · Terry Koo · Slav Petrov · Ilya Sutskever · Geoffrey Hinton -
2015 Tutorial: Large-Scale Distributed Systems for Training Neural Networks »
Jeff Dean · Oriol Vinyals -
2014 Workshop: Deep Learning and Representation Learning »
Andrew Y Ng · Yoshua Bengio · Adam Coates · Roland Memisevic · Sharanyan Chetlur · Geoffrey E Hinton · Shamim Nemati · Bryan Catanzaro · Surya Ganguli · Herbert Jaeger · Phil Blunsom · Leon Bottou · Volodymyr Mnih · Chen-Yu Lee · Rich M Schwartz -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · koray kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · koray kavukcuoglu -
2013 Workshop: Deep Learning »
Yoshua Bengio · Hugo Larochelle · Russ Salakhutdinov · Tomas Mikolov · Matthew D Zeiler · David Mcallester · Nando de Freitas · Josh Tenenbaum · Jian Zhou · Volodymyr Mnih -
2012 Poster: Learning the Dependency Structure of Latent Factors »
Yunlong He · Yanjun Qi · koray kavukcuoglu · Haesun Park -
2011 Poster: Practical Variational Inference for Neural Networks »
Alex Graves -
2011 Spotlight: Practical Variational Inference for Neural Networks »
Alex Graves -
2010 Poster: Generating more realistic images using gated MRF's »
Marc'Aurelio Ranzato · Volodymyr Mnih · Geoffrey E Hinton -
2010 Spotlight: Learning Convolutional Feature Hierarchies for Visual Recognition »
koray kavukcuoglu · Pierre Sermanet · Y-Lan Boureau · Karol Gregor · Michael Mathieu · Yann LeCun -
2010 Poster: Learning Convolutional Feature Hierarchies for Visual Recognition »
koray kavukcuoglu · Pierre Sermanet · Y-Lan Boureau · Karol Gregor · Michael Mathieu · Yann LeCun -
2008 Poster: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks »
Alex Graves · Jürgen Schmidhuber -
2008 Spotlight: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks »
Alex Graves · Jürgen Schmidhuber -
2007 Poster: Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks »
Alex Graves · Santiago Fernandez · Marcus Liwicki · Horst Bunke · Jürgen Schmidhuber -
2007 Poster: Modeling image patches with a directed hierarchy of Markov random fields »
Simon Osindero · Geoffrey E Hinton