Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models.
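The correspondence between tasks and maskings described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `make_mask`, the task labels, and the exact visibility pattern chosen for each task are assumptions made for illustration only. Each task is written as a choice of which states, actions, and returns are shown to the model (1) and which are masked and left to be predicted or ignored (0).

```python
from typing import Dict, List

VISIBLE, MASKED = 1, 0


def make_mask(task: str, horizon: int) -> Dict[str, List[int]]:
    """Return per-timestep visibility for states, actions, and returns.

    Hypothetical task definitions, chosen only to illustrate the idea that
    different tasks are different maskings over the same trajectory.
    """
    if horizon < 2:
        raise ValueError("horizon must be at least 2")

    shown = [VISIBLE] * horizon
    hidden = [MASKED] * horizon
    # Everything visible except the final timestep, which is the target.
    all_but_last = [VISIBLE] * (horizon - 1) + [MASKED]
    # Only the first and last timesteps visible (start state and waypoint).
    endpoints = [VISIBLE] + [MASKED] * (horizon - 2) + [VISIBLE]

    masks = {
        # Predict the last action from states and earlier actions.
        "behavior_cloning": {"states": shown, "actions": all_but_last, "returns": hidden},
        # Same as above, but additionally conditioned on returns.
        "offline_rl": {"states": shown, "actions": all_but_last, "returns": shown},
        # Infer every action from the observed state sequence.
        "inverse_dynamics": {"states": shown, "actions": hidden, "returns": hidden},
        # Condition on a start state and a later state to be reached.
        "waypoint_conditioning": {"states": endpoints, "actions": hidden, "returns": hidden},
    }
    if task not in masks:
        raise KeyError(f"unknown task: {task}")
    # Copy the rows so callers can modify the result safely.
    return {name: list(row) for name, row in masks[task].items()}


if __name__ == "__main__":
    for task in ("behavior_cloning", "offline_rl", "inverse_dynamics", "waypoint_conditioning"):
        print(task, make_mask(task, horizon=4))
```

Under these assumptions, a single model trained on randomly sampled maskings would see all four patterns as special cases of the same input format, which is the intuition behind specifying many tasks in one framework.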
Author Information
Micah Carroll (UC Berkeley)
Orr Paradise (University of California, Berkeley)
Jessy Lin (University of California Berkeley)
Raluca Georgescu (Microsoft)
Mingfei Sun (Microsoft Research)
David Bignell (Research, Microsoft)
Stephanie Milani (Carnegie Mellon University)
Katja Hofmann (Microsoft Research)
Dr. Katja Hofmann is a Principal Researcher at the [Game Intelligence](http://aka.ms/gameintelligence/) group at [Microsoft Research Cambridge, UK](https://www.microsoft.com/en-us/research/lab/microsoft-research-cambridge/). There, she leads a research team that focuses on reinforcement learning with applications in modern video games. She and her team strongly believe that modern video games will drive a transformation of how we interact with AI technology. One of the projects developed by her team is [Project Malmo](https://www.microsoft.com/en-us/research/project/project-malmo/), which uses the popular game Minecraft as an experimentation platform for developing intelligent technology. Katja's long-term goal is to develop AI systems that learn to collaborate with people, to empower their users and help solve complex real-world problems. Before joining Microsoft Research, Katja completed her PhD in Computer Science as part of the [ILPS](https://ilps.science.uva.nl/) group at the [University of Amsterdam](https://www.uva.nl/en). She worked with Maarten de Rijke and Shimon Whiteson on interactive machine learning algorithms for search engines.
Matthew Hausknecht (Microsoft Research)
Anca Dragan (UC Berkeley)
Sam Devlin (Microsoft Research)
More from the Same Authors
-
2021 : B-Pref: Benchmarking Preference-Based Reinforcement Learning »
Kimin Lee · Laura Smith · Anca Dragan · Pieter Abbeel -
2021 Spotlight: Pragmatic Image Compression for Human-in-the-Loop Decision-Making »
Sid Reddy · Anca Dragan · Sergey Levine -
2022 : A Theory of Unsupervised Translation for Understanding Animal Communication »
Shafi Goldwasser · David Gruber · Adam Tauman Kalai · Orr Paradise -
2022 : Contextual Squeeze-and-Excitation »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 : Imitating Human Behaviour with Diffusion Models »
Tim Pearce · Tabish Rashid · Anssi Kanervisto · David Bignell · Mingfei Sun · Raluca Georgescu · Sergio Valcarcel Macua · Shan Zheng Tan · Ida Momennejad · Katja Hofmann · Sam Devlin -
2022 : Time-Efficient Reward Learning via Visually Assisted Cluster Ranking »
David Zhang · Micah Carroll · Andreea Bobu · Anca Dragan -
2022 : Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration »
Mesut Yang · Micah Carroll · Anca Dragan -
2022 : Adversarial poisoning attacks on reinforcement learning-driven energy pricing »
Sam Gunn · Doseok Jang · Orr Paradise · Lucas Spangher · Costas J Spanos -
2022 : Aligning Robot Representations with Humans »
Andreea Bobu · Andi Peng · Pulkit Agrawal · Julie A Shah · Anca Dragan -
2022 Workshop: 5th Robot Learning Workshop: Trustworthy Robotics »
Alex Bewley · Roberto Calandra · Anca Dragan · Igor Gilitschenski · Emily Hannigan · Masha Itkina · Hamidreza Kasaei · Jens Kober · Danica Kragic · Nathan Lambert · Julien PEREZ · Fabio Ramos · Ransalu Senanayake · Jonathan Tompson · Vincent Vanhoucke · Markus Wulfmeier -
2022 Panel: Panel 5A-4: Uni[MASK]: Unified Inference… & Model-Based Offline Reinforcement… »
Kaiyang Guo · Micah Carroll -
2022 : Anca Dragan: Learning human preferences from language »
Anca Dragan -
2022 : A Theory of Unsupervised Translation for Understanding Animal Communication »
Shafi Goldwasser · David Gruber · Adam Kalai · Orr Paradise -
2022 Poster: First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization »
Siddharth Reddy · Sergey Levine · Anca Dragan -
2022 Poster: MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control »
Nolan Wagener · Andrey Kolobov · Felipe Vieira Frujeri · Ricky Loynd · Ching-An Cheng · Matthew Hausknecht -
2022 Poster: Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2021 : Panel II: Machine decisions »
Anca Dragan · Karen Levy · Himabindu Lakkaraju · Ariel Rosenfeld · Maithra Raghu · Irene Y Chen -
2021 : Towards RL applications in video games and with human users »
Katja Hofmann -
2021 : Methods: Understanding Human-like Behavior in Video Game Navigation »
Evelyn Zuniga · Stephanie Milani · Katja Hofmann -
2021 : IGLU: Interactive Grounded Language Understanding in a Collaborative Environment + Q&A »
Julia Kiseleva · Ziming Li · Mohammad Aliannejadi · Maartje Anne ter Hoeve · Mikhail Burtsev · Alexey Skrynnik · Artem Zholus · Aleksandr Panov · Katja Hofmann · Kavya Srinet · arthur szlam · Michel Galley · Ahmed Awadallah -
2021 : BASALT: A MineRL Competition on Solving Human-Judged Task + Q&A »
Rohin Shah · Cody Wild · Steven Wang · Neel Alex · Brandon Houghton · William Guss · Sharada Mohanty · Stephanie Milani · Nicholay Topin · Pieter Abbeel · Stuart Russell · Anca Dragan -
2021 Poster: Pragmatic Image Compression for Human-in-the-Loop Decision-Making »
Sid Reddy · Anca Dragan · Sergey Levine -
2021 : Diamond: A MineRL Competition on Training Sample-Efficient Agents + Q&A »
William Guss · Alara Dirik · Byron Galbraith · Brandon Houghton · Anssi Kanervisto · Noboru Kuno · Stephanie Milani · Sharada Mohanty · Karolis Ramanauskas · Ruslan Salakhutdinov · Rohin Shah · Nicholay Topin · Steven Wang · Cody Wild -
2021 Poster: Grounding Spatio-Temporal Language with Transformers »
Tristan Karch · Laetitia Teodorescu · Katja Hofmann · Clément Moulin-Frier · Pierre-Yves Oudeyer -
2021 Poster: Memory Efficient Meta-Learning with Large Images »
John Bronskill · Daniela Massiceti · Massimiliano Patacchiola · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2020 : Keynote: Anca Dragan »
Anca Dragan -
2020 : Mini-panel discussion 3 - Prioritizing Real World RL Challenges »
Chelsea Finn · Thomas Dietterich · Angela Schoellig · Anca Dragan · Anusha Nagabandi · Doina Precup -
2020 : Introduction and results of the 2020 MineRL Competition »
William Guss · Stephanie Milani · Nicholay Topin -
2020 Workshop: Competition Track Saturday »
Hugo Jair Escalante · Katja Hofmann -
2020 : NeurIPS RL Competitions: MineRL »
William Guss · Stephanie Milani -
2020 : Panel 2: Tensions & Cultivating Resistance AI »
Seeta P Gangadharan · Agata Foryciarz · Mariella Saba · Hamid Khan · Biju Mathew · Vidushi Marda · Micah Carroll -
2020 Workshop: Competition Track Friday »
Hugo Jair Escalante · Katja Hofmann -
2020 : Opening - Competition Track Session »
Katja Hofmann · Hugo Jair Escalante -
2020 Workshop: Resistance AI Workshop »
Suzanne Kite · Mattie Tesfaldet · J Khadijah Abdurahman · William Agnew · Elliot Creager · Agata Foryciarz · Raphael Gontijo Lopes · Pratyusha Kalluri · Marie-Therese Png · Manuel Sabin · Maria Skoularidou · Ramon Vilarino · Rose Wang · Sayash Kapoor · Micah Carroll -
2020 : Q&A for invited speaker, Anca Dragan »
Anca Dragan -
2020 : Getting human-robot interaction strategies to emerge from first principles »
Anca Dragan -
2020 Poster: AvE: Assistance via Empowerment »
Yuqing Du · Stas Tiomkin · Emre Kiciman · Daniel Polani · Pieter Abbeel · Anca Dragan -
2020 Poster: Reward-rational (implicit) choice: A unifying formalism for reward learning »
Hong Jun Jeon · Smitha Milli · Anca Dragan -
2020 Poster: Preference learning along multiple criteria: A game-theoretic perspective »
Kush Bhatia · Ashwin Pananjady · Peter Bartlett · Anca Dragan · Martin Wainwright -
2020 : Discussion Panel: Hugo Larochelle, Finale Doshi-Velez, Devi Parikh, Marc Deisenroth, Julien Mairal, Katja Hofmann, Phillip Isola, and Michael Bowling »
Hugo Larochelle · Finale Doshi-Velez · Marc Deisenroth · Devi Parikh · Julien Mairal · Katja Hofmann · Phillip Isola · Michael Bowling -
2019 : Multi-Task Reinforcement Learning and Generalization »
Katja Hofmann -
2019 : Poster Session »
Matthia Sabatelli · Adam Stooke · Amir Abdi · Paulo Rauber · Leonard Adolphs · Ian Osband · Hardik Meisheri · Karol Kurach · Johannes Ackermann · Matt Benatan · GUO ZHANG · Chen Tessler · Dinghan Shen · Mikayel Samvelyan · Riashat Islam · Murtaza Dalal · Luke Harries · Andrey Kurenkov · Konrad Żołna · Sudeep Dasari · Kristian Hartikainen · Ofir Nachum · Kimin Lee · Markus Holzleitner · Vu Nguyen · Francis Song · Christopher Grimm · Felipe Leno da Silva · Yuping Luo · Yifan Wu · Alex Lee · Thomas Paine · Wei-Yang Qu · Daniel Graves · Yannis Flet-Berliac · Yunhao Tang · Suraj Nair · Matthew Hausknecht · Akhil Bagaria · Simon Schmitt · Bowen Baker · Paavo Parmas · Benjamin Eysenbach · Lisa Lee · Siyu Lin · Daniel Seita · Abhishek Gupta · Riley Simmons-Edler · Yijie Guo · Kevin Corder · Vikash Kumar · Scott Fujimoto · Adam Lerer · Ignasi Clavera Gilaberte · Nicholas Rhinehart · Ashvin Nair · Ge Yang · Lingxiao Wang · Sungryull Sohn · J. Fernando Hernandez-Garcia · Xian Yeow Lee · Rupesh Srivastava · Khimya Khetarpal · Chenjun Xiao · Luckeciano Carvalho Melo · Rishabh Agarwal · Tianhe Yu · Glen Berseth · Devendra Singh Chaplot · Jie Tang · Anirudh Srinivasan · Tharun Kumar Reddy Medini · Aaron Havens · Misha Laskin · Asier Mujika · Rohan Saphal · Joseph Marino · Alex Ray · Joshua Achiam · Ajay Mandlekar · Zhuang Liu · Danijar Hafner · Zhiwen Tang · Ted Xiao · Michael Walton · Jeff Druce · Ferran Alet · Zhang-Wei Hong · Stephanie Chan · Anusha Nagabandi · Hao Liu · Hao Sun · Ge Liu · Dinesh Jayaraman · John Co-Reyes · Sophia Sanborn -
2019 : Contributed Talks »
Kevin Lu · Matthew Hausknecht · Ofir Nachum -
2019 : The MineRL competition »
Misa Ogura · Joe Booth · Sophia Sun · Nicholay Topin · Brandon Houghton · William Guss · Stephanie Milani · Oriol Vinyals · Katja Hofmann · JIA KIM · Karolis Ramanauskas · Florian Laurent · Daichi Nishio · Anssi Kanervisto · Alexey Skrynnik · Artemij Amiranashvili · Christian Scheller · KAIXIN WANG · Yanick Schraner -
2019 Workshop: Machine Learning for Autonomous Driving »
Rowan McAllister · Nicholas Rhinehart · Fisher Yu · Li Erran Li · Anca Dragan -
2019 : Catered Lunch and Poster Viewing (in Workshop Room) »
Gustavo Stolovitzky · Prabhu Pradhan · Pablo Duboue · Zhiwen Tang · Aleksei Natekin · Elizabeth Bondi-Kelly · Xavier Bouthillier · Stephanie Milani · Heimo Müller · Andreas T. Holzinger · Stefan Harrer · Ben Day · Andrey Ustyuzhanin · William Guss · Mahtab Mirmomeni -
2019 Poster: Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck »
Maximilian Igl · Kamil Ciosek · Yingzhen Li · Sebastian Tschiatschek · Cheng Zhang · Sam Devlin · Katja Hofmann -
2019 Poster: On the Utility of Learning about Humans for Human-AI Coordination »
Micah Carroll · Rohin Shah · Mark Ho · Tom Griffiths · Sanjit Seshia · Pieter Abbeel · Anca Dragan -
2019 Poster: Better Exploration with Optimistic Actor Critic »
Kamil Ciosek · Quan Vuong · Robert Loftin · Katja Hofmann -
2019 Spotlight: Better Exploration with Optimistic Actor Critic »
Kamil Ciosek · Quan Vuong · Robert Loftin · Katja Hofmann -
2019 Poster: Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning »
David Janz · Jiri Hron · Przemysław Mazur · Katja Hofmann · José Miguel Hernández-Lobato · Sebastian Tschiatschek -
2019 Tutorial: Reinforcement Learning: Past, Present, and Future Perspectives »
Katja Hofmann -
2018 : Anca Dragan »
Anca Dragan -
2018 : How Players Speak to an Intelligent Game Character Using Natural Language Messages »
Katja Hofmann -
2018 : Opening Remark »
Li Erran Li · Anca Dragan -
2018 Workshop: NIPS Workshop on Machine Learning for Intelligent Transportation Systems 2018 »
Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese -
2018 : Anca Dragan »
Anca Dragan -
2018 Poster: Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior »
Sid Reddy · Anca Dragan · Sergey Levine -
2017 : Morning panel discussion »
Jürgen Schmidhuber · Noah Goodman · Anca Dragan · Pushmeet Kohli · Dhruv Batra -
2017 : "Communication via Physical Action" »
Anca Dragan -
2017 Workshop: 2017 NIPS Workshop on Machine Learning for Intelligent Transportation Systems »
Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese -
2017 : Invited talk: Robot Transparency as Optimal Control »
Anca Dragan -
2017 : Panel: "How can we characterise the landscape of intelligent systems and locate human-like intelligence in it?" »
Josh Tenenbaum · Gary Marcus · Katja Hofmann -
2017 : Katja Hofmann: 'Video games and the road to collaborative AI' »
Katja Hofmann -
2016 : Learning Reliable Objectives »
Anca Dragan -
2016 : Invited Talk: Autonomous Cars that Coordinate with People (Anca Dragan, Berkeley) »
Anca Dragan -
2016 Demonstration: Project Malmo - Minecraft for AI Research »
Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas -
2016 Poster: Cooperative Inverse Reinforcement Learning »
Dylan Hadfield-Menell · Stuart J Russell · Pieter Abbeel · Anca Dragan