Timezone: »
When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, we present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language). Using aligned image and caption data, we train a vision encoder to represent each image as a sequence of continuous embeddings, such that a pre-trained, frozen language model presented with this prefix generates the appropriate caption. The resulting system is a multimodal few-shot learner, with the surprising ability to learn a variety of new tasks when conditioned on examples, represented as a sequence of any number of interleaved image and text embeddings. We demonstrate that it can rapidly learn words for new objects and novel visual categories, do visual question-answering with only a handful of examples, and make use of outside knowledge, by measuring a single model on a variety of established and new benchmarks.
Author Information
Maria Tsimpoukelli (DeepMind)
Jacob L Menick (Google DeepMind)
Serkan Cabi (DeepMind)
S. M. Ali Eslami (DeepMind)
Oriol Vinyals (Google DeepMind)
Felix Hill (Deepmind)
More from the Same Authors
-
2021 : Inferring a Continuous Distribution of Atom Coordinates from Cryo-EM Images using VAEs »
Dan Rosenbaum · Marta Garnelo · Michal Zielinski · Charles Beattie · Ellen Clancy · Andrea Huber · Pushmeet Kohli · Andrew Senior · John Jumper · Carl Doersch · S. M. Ali Eslami · Olaf Ronneberger · Jonas Adler -
2021 : StarCraft II Unplugged: Large Scale Offline Reinforcement Learning »
Michael Mathieu · Sherjil Ozair · Srivatsan Srinivasan · Caglar Gulcehre · Shangtong Zhang · Ray Jiang · Tom Paine · Konrad Żołna · Julian Schrittwieser · David Choi · Petko I Georgiev · Daniel Toyama · Roman Ring · Igor Babuschkin · Timo Ewalds · · Aaron van den Oord · Wojciech Czarnecki · Nando de Freitas · Oriol Vinyals -
2021 : Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning »
Wilka Carvalho · Andrew Lampinen · Kyriacos Nikiforou · Felix Hill · Murray Shanahan -
2022 : Transformers generalize differently from information stored in context vs in weights »
Stephanie Chan · Ishita Dasgupta · Junkyung Kim · Dharshan Kumaran · Andrew Lampinen · Felix Hill -
2022 : Collaborating with language models for embodied reasoning »
Ishita Dasgupta · Christine Kaeser-Chen · Kenneth Marino · Arun Ahuja · Sheila Babayan · Felix Hill · Rob Fergus -
2022 : Collaborating with language models for embodied reasoning »
Ishita Dasgupta · Christine Kaeser-Chen · Kenneth Marino · Arun Ahuja · Sheila Babayan · Felix Hill · Rob Fergus -
2023 Poster: Self-supervised video pretraining yields human-aligned visual representations »
Nikhil Parthasarathy · S. M. Ali Eslami · Joao Carreira · Olivier Henaff -
2023 Poster: The Transient Nature of Emergent In-context Learning in Transformers »
Aaditya Singh · Stephanie Chan · Ted Moskovitz · Erin Grant · Andrew Saxe · Felix Hill -
2023 Affinity Workshop: Muslims in ML »
Sanae Lotfi · Hammaad Adam · Marzyeh Ghassemi · Shakir Mohamed · S. M. Ali Eslami -
2022 : Meaning without reference in large language models »
Steven Piantadosi · Felix Hill -
2022 Poster: Data Distributional Properties Drive Emergent In-Context Learning in Transformers »
Stephanie Chan · Adam Santoro · Andrew Lampinen · Jane Wang · Aaditya Singh · Pierre Richemond · James McClelland · Felix Hill -
2022 Poster: Semantic Exploration from Language Abstractions and Pretrained Representations »
Allison Tam · Neil Rabinowitz · Andrew Lampinen · Nicholas Roy · Stephanie Chan · DJ Strouse · Jane Wang · Andrea Banino · Felix Hill -
2022 Poster: Flamingo: a Visual Language Model for Few-Shot Learning »
Jean-Baptiste Alayrac · Jeff Donahue · Pauline Luc · Antoine Miech · Iain Barr · Yana Hasson · Karel Lenc · Arthur Mensch · Katherine Millican · Malcolm Reynolds · Roman Ring · Eliza Rutherford · Serkan Cabi · Tengda Han · Zhitao Gong · Sina Samangooei · Marianne Monteiro · Jacob L Menick · Sebastian Borgeaud · Andy Brock · Aida Nematzadeh · Sahand Sharifzadeh · Mikołaj Bińkowski · Ricardo Barreira · Oriol Vinyals · Andrew Zisserman · Karén Simonyan -
2021 : Inferring a Continuous Distribution of Atom Coordinates from Cryo-EM Images using VAEs »
Dan Rosenbaum · Marta Garnelo · Michal Zielinski · Charles Beattie · Ellen Clancy · Andrea Huber · Pushmeet Kohli · Andrew Senior · John Jumper · Carl Doersch · S. M. Ali Eslami · Olaf Ronneberger · Jonas Adler -
2021 Poster: Attention over Learned Object Embeddings Enables Complex Visual Reasoning »
David Ding · Felix Hill · Adam Santoro · Malcolm Reynolds · Matt Botvinick -
2021 Poster: Towards mental time travel: a hierarchical memory for reinforcement learning agents »
Andrew Lampinen · Stephanie Chan · Andrea Banino · Felix Hill -
2021 Oral: Attention over Learned Object Embeddings Enables Complex Visual Reasoning »
David Ding · Felix Hill · Adam Santoro · Malcolm Reynolds · Matt Botvinick -
2018 : Poster Session 1 + Coffee »
Tom Van de Wiele · Rui Zhao · J. Fernando Hernandez-Garcia · Fabio Pardo · Xian Yeow Lee · Xiaolin Andy Li · Marcin Andrychowicz · Jie Tang · Suraj Nair · Juhyeon Lee · Cédric Colas · S. M. Ali Eslami · Yen-Chen Wu · Stephen McAleer · Ryan Julian · Yang Xue · Matthia Sabatelli · Pranav Shyam · Alexandros Kalousis · Giovanni Montana · Emanuele Pesce · Felix Leibfried · Zhanpeng He · Chunxiao Liu · Yanjun Li · Yoshihide Sawada · Alexander Pashevich · Tejas Kulkarni · Keiran Paster · Luca Rigazio · Quan Vuong · Hyunggon Park · Minhae Kwon · Rivindu Weerasekera · Shamane Siriwardhanaa · Rui Wang · Ozsel Kilinc · Keith Ross · Yizhou Wang · Simon Schmitt · Thomas Anthony · Evan Cater · Forest Agostinelli · Tegg Sung · Shirou Maruyama · Alexander Shmakov · Devin Schwab · Mohammad Firouzi · Glen Berseth · Denis Osipychev · Jesse Farebrother · Jianlan Luo · William Agnew · Peter Vrancx · Jonathan Heek · Catalin Ionescu · Haiyan Yin · Megumi Miyashita · Nathan Jay · Noga H. Rotman · Sam Leroux · Shaileshh Bojja Venkatakrishnan · Henri Schmidt · Jack Terwilliger · Ishan Durugkar · Jonathan Sauder · David Kas · Arash Tavakoli · Alain-Sam Cohen · Philip Bontrager · Adam Lerer · Thomas Paine · Ahmed Khalifa · Ruben Rodriguez · Avi Singh · Yiming Zhang -
2018 Poster: A Probabilistic U-Net for Segmentation of Ambiguous Images »
Simon Kohl · Bernardino Romera-Paredes · Clemens Meyer · Jeffrey De Fauw · Joseph R. Ledsam · Klaus Maier-Hein · S. M. Ali Eslami · Danilo Jimenez Rezende · Olaf Ronneberger -
2018 Poster: Neural Arithmetic Logic Units »
Andrew Trask · Felix Hill · Scott Reed · Jack Rae · Chris Dyer · Phil Blunsom -
2018 Spotlight: A Probabilistic U-Net for Segmentation of Ambiguous Images »
Simon Kohl · Bernardino Romera-Paredes · Clemens Meyer · Jeffrey De Fauw · Joseph R. Ledsam · Klaus Maier-Hein · S. M. Ali Eslami · Danilo Jimenez Rezende · Olaf Ronneberger -
2017 : Panel Discussion »
Felix Hill · Olivier Pietquin · Jack Gallant · Raymond Mooney · Sanja Fidler · Chen Yu · Devi Parikh -
2017 : Grounded Language Learning in a Simulated 3D World »
Felix Hill -
2017 Workshop: Machine Learning for Creativity and Design »
Douglas Eck · David Ha · S. M. Ali Eslami · Sander Dieleman · Rebecca Fiebrink · Luba Elliott -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Poster: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models »
S. M. Ali Eslami · Nicolas Heess · Theophane Weber · Yuval Tassa · David Szepesvari · koray kavukcuoglu · Geoffrey E Hinton -
2015 Workshop: Black box learning and inference »
Josh Tenenbaum · Jan-Willem van de Meent · Tejas Kulkarni · S. M. Ali Eslami · Brooks Paige · Frank Wood · Zoubin Ghahramani -
2014 Poster: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2014 Oral: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2012 Poster: Learning with Recursive Perceptual Representations »
Oriol Vinyals · Yangqing Jia · Li Deng · Trevor Darrell