Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse. One common solution is to use intrinsic rewards to encourage agents to explore their environment. However, recent intrinsic exploration methods often use state-based novelty measures which reward low-level exploration and may not scale to domains requiring more abstract skills. Instead, we explore natural language as a general medium for highlighting relevant abstractions in an environment. Unlike previous work, we evaluate whether language can improve over existing exploration methods by directly extending (and comparing to) competitive intrinsic exploration baselines: AMIGo (Campero et al., 2021) and NovelD (Zhang et al., 2021). These language-based variants outperform their non-linguistic forms by 47-85% across 13 challenging tasks from the MiniGrid and MiniHack environment suites.
Author Information
Jesse Mu (Stanford University)
Victor Zhong (University of Washington)
Roberta Raileanu (FAIR)
Minqi Jiang (UCL & FAIR)
Noah Goodman (Stanford University)
Tim Rocktäschel (University College London, Facebook AI Research)
Tim is a Researcher at Facebook AI Research (FAIR) London, an Associate Professor at the Centre for Artificial Intelligence in the Department of Computer Science at University College London (UCL), and a Scholar of the European Laboratory for Learning and Intelligent Systems (ELLIS). Prior to that, he was a Postdoctoral Researcher in Reinforcement Learning at the University of Oxford, a Junior Research Fellow in Computer Science at Jesus College, and a Stipendiary Lecturer in Computer Science at Hertford College. Tim obtained his Ph.D. from UCL under the supervision of Sebastian Riedel, and he was awarded a Microsoft Research Ph.D. Scholarship in 2013 and a Google Ph.D. Fellowship in 2017. His work focuses on reinforcement learning in open-ended environments that require intrinsically motivated agents capable of transferring commonsense, world and domain knowledge in order to systematically generalize to novel situations.
Edward Grefenstette (Cohere & University College London)
More from the Same Authors
-
2021 : MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research »
Mikayel Samvelyan · Robert Kirk · Vitaly Kurin · Jack Parker-Holder · Minqi Jiang · Eric Hambro · Fabio Petroni · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel -
2021 : DABS: a Domain-Agnostic Benchmark for Self-Supervised Learning »
Alex Tamkin · Vincent Liu · Rongfei Lu · Daniel Fein · Colin Schultz · Noah Goodman -
2021 : Grounding Aleatoric Uncertainty in Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Kuttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2021 : That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 : Graph Backup: Data Efficient Backup Exploiting Markovian Data »
Zhengyao Jiang · Tianjun Zhang · Robert Kirk · Tim Rocktäschel · Edward Grefenstette -
2021 : Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay »
Iryna Korshunova · Minqi Jiang · Jack Parker-Holder · Tim Rocktäschel · Edward Grefenstette -
2021 : Learning to solve complex tasks by growing knowledge culturally across generations »
Michael Tessler · Jason Madeano · Pedro Tsividis · Noah Goodman · Josh Tenenbaum -
2022 : Lemma: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions »
Zhening Li · Gabriel Poesia Reis e Silva · Omar Costilla Reyes · Noah Goodman · Armando Solar-Lezama -
2022 : Efficient Planning in a Compact Latent Action Space »
Zhengyao Jiang · Tianjun Zhang · Michael Janner · Yueying (Lisa) Li · Tim Rocktäschel · Edward Grefenstette · Yuandong Tian -
2022 : Optimal Transport for Offline Imitation Learning »
Yicheng Luo · Zhengyao Jiang · Samuel Cohen · Edward Grefenstette · Marc Deisenroth -
2022 : On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning »
Dilip Arumugam · Mark Ho · Noah Goodman · Benjamin Van Roy -
2022 : Integrating Episodic and Global Bonuses for Efficient Exploration »
Mikael Henaff · Minqi Jiang · Roberta Raileanu -
2022 : In the ZONE: Measuring difficulty and progression in curriculum generation »
Rose Wang · Jesse Mu · Dilip Arumugam · Natasha Jaques · Noah Goodman -
2022 : Building a Subspace of Policies for Scalable Continual Learning »
Jean-Baptiste Gaya · Thang Long Doan · Lucas Page-Caccia · Laure Soulier · Ludovic Denoyer · Roberta Raileanu -
2022 : Uncertainty-Driven Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2022 : MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning »
Mikayel Samvelyan · Akbir Khan · Michael Dennis · Minqi Jiang · Jack Parker-Holder · Jakob Foerster · Roberta Raileanu · Tim Rocktäschel -
2023 Poster: On the Importance of Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2023 Poster: Why think step by step? Reasoning emerges from the locality of experience »
Benjamin Prystawski · Michael Li · Noah Goodman -
2023 Poster: Parsel🐍: Algorithmic Reasoning with Language Models by Composing Decompositions »
Eric Zelikman · Qian Huang · Gabriel Poesia · Noah Goodman · Nick Haber -
2023 Poster: Interpretability at Scale: Identifying Causal Mechanisms in Alpaca »
Zhengxuan Wu · Atticus Geiger · Christopher Potts · Noah Goodman -
2023 Poster: Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design »
Matthew T Jackson · Minqi Jiang · Jack Parker-Holder · Risto Vuorio · Chris Lu · Greg Farquhar · Shimon Whiteson · Jakob Foerster -
2023 Poster: Improving Language Plasticity via Pretraining with Active Forgetting »
Yihong Chen · Mikel Artetxe · Kelly Marchisio · Roberta Raileanu · David Adelani · Pontus Lars Erik Saito Stenetorp · Sebastian Riedel -
2023 Poster: Toolformer: Language Models Can Teach Themselves to Use Tools »
Timo Schick · Jane Dwivedi-Yu · Roberto Dessi · Roberta Raileanu · Maria Lomeli · Eric Hambro · Luke Zettlemoyer · Nicola Cancedda · Thomas Scialom -
2023 Poster: Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning »
Alex Tamkin · Margalit Glasgow · Xiluo He · Noah Goodman -
2023 Poster: The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs »
Laura Ruis · Akbir Khan · Stella Biderman · Sara Hooker · Tim Rocktäschel · Edward Grefenstette -
2023 Poster: Learning to Compress Prompts with Gist Tokens »
Jesse Mu · Xiang Li · Noah Goodman -
2023 Poster: Understanding Social Reasoning in Language Models with Language Models »
Kanishk Gandhi · Jan-Philipp Franken · Tobias Gerstenberg · Noah Goodman -
2023 Oral: Why think step by step? Reasoning emerges from the locality of experience »
Benjamin Prystawski · Michael Li · Noah Goodman -
2023 Oral: Toolformer: Language Models Can Teach Themselves to Use Tools »
Timo Schick · Jane Dwivedi-Yu · Roberto Dessi · Roberta Raileanu · Maria Lomeli · Eric Hambro · Luke Zettlemoyer · Nicola Cancedda · Thomas Scialom -
2023 Workshop: Socially Responsible Language Modelling Research (SoLaR) »
Usman Anwar · David Krueger · Samuel Bowman · Jakob Foerster · Su Lin Blodgett · Roberta Raileanu · Alan Chan · Katherine Lee · Laura Ruis · Robert Kirk · Yawen Duan · Xin Chen · Kawin Ethayarajh -
2023 Workshop: Agent Learning in Open-Endedness Workshop »
Minqi Jiang · Mikayel Samvelyan · Jack Parker-Holder · Mayalen Etcheverry · Yingchen Xu · Michael Dennis · Roberta Raileanu -
2022 : MATH-AI: Toward Human-Level Mathematical Reasoning »
Francois Charton · Noah Goodman · Behnam Neyshabur · Talia Ringer · Daniel Selsam -
2022 : Learning Mathematical Reasoning for Education »
Noah Goodman -
2022 : Invited Talk: Noah Goodman »
Noah Goodman -
2022 Workshop: LaReL: Language and Reinforcement Learning »
Laetitia Teodorescu · Laura Ruis · Tristan Karch · Cédric Colas · Paul Barde · Jelena Luketina · Athul Jacob · Pratyusha Sharma · Edward Grefenstette · Jacob Andreas · Marc-Alexandre Côté -
2022 Poster: Assistive Teaching of Motor Control Tasks to Humans »
Megha Srivastava · Erdem Biyik · Suvir Mirchandani · Noah Goodman · Dorsa Sadigh -
2022 Poster: CLEVRER-Humans: Describing Physical and Causal Events the Human Way »
Jiayuan Mao · Xuelin Yang · Xikun Zhang · Noah Goodman · Jiajun Wu -
2022 Poster: Dungeons and Data: A Large-Scale NetHack Dataset »
Eric Hambro · Roberta Raileanu · Danielle Rothermel · Vegard Mella · Tim Rocktäschel · Heinrich Küttler · Naila Murray -
2022 Poster: Learning General World Models in a Handful of Reward-Free Deployments »
Yingchen Xu · Jack Parker-Holder · Aldo Pacchiano · Philip Ball · Oleh Rybkin · S Roberts · Tim Rocktäschel · Edward Grefenstette -
2022 Poster: Geoclidean: Few-Shot Generalization in Euclidean Geometry »
Joy Hsu · Jiajun Wu · Noah Goodman -
2022 Poster: Active Learning Helps Pretrained Models Learn the Intended Task »
Alex Tamkin · Dat Nguyen · Salil Deshpande · Jesse Mu · Noah Goodman -
2022 Poster: Foundation Posteriors for Approximate Probabilistic Inference »
Mike Wu · Noah Goodman -
2022 Poster: Grounding Aleatoric Uncertainty for Unsupervised Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Andrei Lupu · Heinrich Küttler · Edward Grefenstette · Tim Rocktäschel · Jakob Foerster -
2022 Poster: STaR: Bootstrapping Reasoning With Reasoning »
Eric Zelikman · Yuhuai Wu · Jesse Mu · Noah Goodman -
2022 Poster: Improving Policy Learning via Language Dynamics Distillation »
Victor Zhong · Jesse Mu · Luke Zettlemoyer · Edward Grefenstette · Tim Rocktäschel -
2022 Poster: Exploration via Elliptical Episodic Bonuses »
Mikael Henaff · Roberta Raileanu · Minqi Jiang · Tim Rocktäschel -
2022 Poster: GriddlyJS: A Web IDE for Reinforcement Learning »
Christopher Bamford · Minqi Jiang · Mikayel Samvelyan · Tim Rocktäschel -
2022 Poster: DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision »
Alex Tamkin · Gaurab Banerjee · Mohamed Owda · Vincent Liu · Shashank Rammoorthy · Noah Goodman -
2021 : Spotlight Talk: Learning to solve complex tasks by growing knowledge culturally across generations »
Noah Goodman · Josh Tenenbaum · Michael Tessler · Jason Madeano -
2021 : Multi-party referential communication in complex strategic games »
Jessica Mankewitz · Veronica Boyce · Brandon Waldon · Georgia Loukatou · Dhara Yu · Jesse Mu · Noah Goodman · Michael C Frank -
2021 Workshop: Meaning in Context: Pragmatic Communication in Humans and Machines »
Jennifer Hu · Noga Zaslavsky · Aida Nematzadeh · Michael Franke · Roger Levy · Noah Goodman -
2021 : Opening remarks »
Jennifer Hu · Noga Zaslavsky · Aida Nematzadeh · Michael Franke · Roger Levy · Noah Goodman -
2021 Poster: Emergent Communication of Generalizations »
Jesse Mu · Noah Goodman -
2021 : The NetHack Challenge + Q&A »
Eric Hambro · Sharada Mohanty · Dipam Chakraborty · Edward Grefenstette · Minqi Jiang · Robert Kirk · Vitaly Kurin · Heinrich Kuttler · Vegard Mella · Nantas Nardelli · Jack Parker-Holder · Roberta Raileanu · Tim Rocktäschel · Danielle Rothermel · Mikayel Samvelyan -
2021 Poster: Replay-Guided Adversarial Environment Design »
Minqi Jiang · Michael Dennis · Jack Parker-Holder · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2021 Poster: Contrastive Reinforcement Learning of Symbolic Reasoning Domains »
Gabriel Poesia · WenXin Dong · Noah Goodman -
2021 Poster: Improving Compositionality of Neural Networks by Decoding Representations to Inputs »
Mike Wu · Noah Goodman · Stefano Ermon -
2021 Poster: SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark »
Victor Zhong · Austin W. Hanjie · Sida Wang · Karthik Narasimhan · Luke Zettlemoyer -
2021 Panel: The Consequences of Massive Scaling in Machine Learning »
Noah Goodman · Melanie Mitchell · Joelle Pineau · Oriol Vinyals · Jared Kaplan -
2020 Poster: The NetHack Learning Environment »
Heinrich Küttler · Nantas Nardelli · Alexander Miller · Roberta Raileanu · Marco Selvatici · Edward Grefenstette · Tim Rocktäschel -
2020 Poster: Compositional Explanations of Neurons »
Jesse Mu · Jacob Andreas -
2020 Oral: Compositional Explanations of Neurons »
Jesse Mu · Jacob Andreas -
2020 Poster: Language Through a Prism: A Spectral Approach for Multiscale Language Representations »
Alex Tamkin · Dan Jurafsky · Noah Goodman -
2019 Poster: Variational Bayesian Optimal Experimental Design »
Adam Foster · Martin Jankowiak · Elias Bingham · Paul Horsfall · Yee Whye Teh · Thomas Rainforth · Noah Goodman -
2019 Spotlight: Variational Bayesian Optimal Experimental Design »
Adam Foster · Martin Jankowiak · Elias Bingham · Paul Horsfall · Yee Whye Teh · Thomas Rainforth · Noah Goodman -
2018 : Poster Sessions and Lunch (Provided) »
Akira Utsumi · Alane Suhr · Ji Zhang · Ramon Sanabria · Kushal Kafle · Nicholas Chen · Seung Wook Kim · Aishwarya Agrawal · SRI HARSHA DUMPALA · Shikhar Murty · Pablo Azagra · Jean ROUAT · Alaaeldin Ali · SUBBAREDDY OOTA · Angela Lin · Shruti Palaskar · Farley Lai · Amir Aly · Tingke Shen · Dianqi Li · Jianguo Zhang · Rita Kuznetsova · Jinwon An · Jean-Benoit Delbrouck · Tomasz Kornuta · Syed Ashar Javed · Christopher Davis · John Co-Reyes · Vasu Sharma · Sungwon Lyu · Ning Xie · Ankita Kalra · Huan Ling · Oleksandr Maksymets · Bhavana Mahendra Jain · Shun-Po Chuang · Sanyam Agarwal · Jerome Abdelnour · Yufei Feng · Vincent Albouy · Siddharth Karamcheti · Derek Doran · Roberta Raileanu · Jonathan Heek -
2018 Poster: e-SNLI: Natural Language Inference with Natural Language Explanations »
Oana-Maria Camburu · Tim Rocktäschel · Thomas Lukasiewicz · Phil Blunsom -
2018 Poster: Bias and Generalization in Deep Generative Models: An Empirical Study »
Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon -
2018 Spotlight: Bias and Generalization in Deep Generative Models: An Empirical Study »
Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon -
2018 Poster: Multimodal Generative Models for Scalable Weakly-Supervised Learning »
Mike Wu · Noah Goodman -
2017 : Contributed Talks 2 »
Roberta Raileanu · Satwik Kottur · Paul Grouchy -
2017 : Morning panel discussion »
Jürgen Schmidhuber · Noah Goodman · Anca Dragan · Pushmeet Kohli · Dhruv Batra -
2017 : "Language in context" »
Noah Goodman -
2017 Workshop: 6th Workshop on Automated Knowledge Base Construction (AKBC) »
Jay Pujara · Dor Arad · Bhavana Dalvi Mishra · Tim Rocktäschel -
2017 Poster: End-to-End Differentiable Proving »
Tim Rocktäschel · Sebastian Riedel -
2017 Oral: End-to-end Differentiable Proving »
Tim Rocktäschel · Sebastian Riedel -
2017 Poster: Learning Disentangled Representations with Semi-Supervised Deep Generative Models »
Siddharth Narayanaswamy · Brooks Paige · Jan-Willem van de Meent · Alban Desmaison · Noah Goodman · Pushmeet Kohli · Frank Wood · Philip Torr -
2016 Workshop: Neural Abstract Machines & Program Induction »
Matko Bošnjak · Nando de Freitas · Tejas Kulkarni · Arvind Neelakantan · Scott E Reed · Sebastian Riedel · Tim Rocktäschel -
2016 Poster: Neurally-Guided Procedural Models: Amortized Inference for Procedural Graphics Programs using Neural Networks »
Daniel Ritchie · Anna Thomas · Pat Hanrahan · Noah Goodman -
2015 Workshop: Bounded Optimality and Rational Metareasoning »
Samuel J Gershman · Falk Lieder · Tom Griffiths · Noah Goodman -
2015 Poster: Teaching Machines to Read and Comprehend »
Karl Moritz Hermann · Tomas Kocisky · Edward Grefenstette · Lasse Espeholt · Will Kay · Mustafa Suleyman · Phil Blunsom -
2015 Poster: Learning to Transduce with Unbounded Memory »
Edward Grefenstette · Karl Moritz Hermann · Mustafa Suleyman · Phil Blunsom -
2013 Poster: Learning and using language via recursive pragmatic reasoning about other agents »
Nathaniel J Smith · Noah Goodman · Michael C Frank -
2013 Poster: Learning Stochastic Inverses »
Andreas Stuhlmüller · Jacob Taylor · Noah Goodman -
2012 Workshop: Probabilistic Programming: Foundations and Applications (2 day) »
Vikash Mansinghka · Daniel Roy · Noah Goodman -
2012 Poster: Burn-in, bias, and the rationality of anchoring »
Falk Lieder · Tom Griffiths · Noah Goodman -
2011 Poster: Nonstandard Interpretations of Probabilistic Programs for Efficient Inference »
David Wingate · Noah Goodman · Andreas Stuhlmueller · Jeffrey Siskind