Timezone: »
StarCraft II is one of the most challenging reinforcement learning (RL) environments; it is partially observable, stochastic, and multi-agent, and mastering StarCraft II requires strategic planning over long-time horizons with real-time low-level execution. It also has an active human competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because a massive dataset of millions of StarCraft II games played by human players has been released by Blizzard. This paper leverages that and establishes a benchmark, which we call StarCraft II Unplugged, that introduces unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard’s release), tools standardising an API for ML methods, and an evaluation protocol. We also present baseline agents, including behaviour cloning, and offline variants of V-trace actor-critic and MuZero. We find that the variants of those algorithms with behaviour value estimation and single step policy improvement work best and exceed 90% win rate against previously published AlphaStar behaviour cloning agents.
Author Information
Michael Mathieu (DeepMind)
Sherjil Ozair (DeepMind)
Srivatsan Srinivasan (Google)
Caglar Gulcehre (Deepmind)
Shangtong Zhang (University of Oxford)
Ray Jiang (DeepMind)
Tom Paine
Konrad Żołna (DeepMind)
Julian Schrittwieser (DeepMind)
David Choi (DeepMind)
Petko I Georgiev (Google DeepMind)
Daniel Toyama (DeepMind)
Roman Ring (University of Tartu)
Igor Babuschkin (DeepMind)
Timo Ewalds (Deepmind)
Aaron van den Oord (Google Deepmind)
Wojciech Czarnecki (DeepMind)
Nando de Freitas (UBC)
Oriol Vinyals (Google DeepMind)
More from the Same Authors
-
2021 Spotlight: Online and Offline Reinforcement Learning by Planning with a Learned Model »
Julian Schrittwieser · Thomas Hubert · Amol Mandhane · Mohammadamin Barekatain · Ioannis Antonoglou · David Silver -
2022 Poster: Intra-agent speech permits zero-shot task acquisition »
Chen Yan · Federico Carnevale · Petko I Georgiev · Adam Santoro · Aurelia Guy · Alistair Muldal · Chia-Chun Hung · Joshua Abramson · Timothy Lillicrap · Gregory Wayne -
2022 Poster: Truncated Emphatic Temporal Difference Methods for Prediction and Control »
Shangtong Zhang · Shimon Whiteson -
2021 Poster: Multimodal Few-Shot Learning with Frozen Language Models »
Maria Tsimpoukelli · Jacob L Menick · Serkan Cabi · S. M. Ali Eslami · Oriol Vinyals · Felix Hill -
2021 Poster: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer »
Ge Yang · Edward Hu · Igor Babuschkin · Szymon Sidor · Xiaodong Liu · David Farhi · Nick Ryder · Jakub Pachocki · Weizhu Chen · Jianfeng Gao -
2021 Poster: Active Offline Policy Selection »
Ksenia Konyushova · Yutian Chen · Thomas Paine · Caglar Gulcehre · Cosmin Paduraru · Daniel Mankowitz · Misha Denil · Nando de Freitas -
2021 Poster: Online and Offline Reinforcement Learning by Planning with a Learned Model »
Julian Schrittwieser · Thomas Hubert · Amol Mandhane · Mohammadamin Barekatain · Ioannis Antonoglou · David Silver -
2020 : Multi-Format Contrastive Learning of Audio Representations »
Aaron van den Oord -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: Learning Retrospective Knowledge with Reverse Reinforcement Learning »
Shangtong Zhang · Vivek Veeriah · Shimon Whiteson -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Session »
Matthia Sabatelli · Adam Stooke · Amir Abdi · Paulo Rauber · Leonard Adolphs · Ian Osband · Hardik Meisheri · Karol Kurach · Johannes Ackermann · Matt Benatan · GUO ZHANG · Chen Tessler · Dinghan Shen · Mikayel Samvelyan · Riashat Islam · Murtaza Dalal · Luke Harries · Andrey Kurenkov · Konrad Żołna · Sudeep Dasari · Kristian Hartikainen · Ofir Nachum · Kimin Lee · Markus Holzleitner · Vu Nguyen · Francis Song · Christopher Grimm · Felipe Leno da Silva · Yuping Luo · Yifan Wu · Alex Lee · Thomas Paine · Wei-Yang Qu · Daniel Graves · Yannis Flet-Berliac · Yunhao Tang · Suraj Nair · Matthew Hausknecht · Akhil Bagaria · Simon Schmitt · Bowen Baker · Paavo Parmas · Benjamin Eysenbach · Lisa Lee · Siyu Lin · Daniel Seita · Abhishek Gupta · Riley Simmons-Edler · Yijie Guo · Kevin Corder · Vikash Kumar · Scott Fujimoto · Adam Lerer · Ignasi Clavera Gilaberte · Nicholas Rhinehart · Ashvin Nair · Ge Yang · Lingxiao Wang · Sungryull Sohn · J. Fernando Hernandez-Garcia · Xian Yeow Lee · Rupesh Srivastava · Khimya Khetarpal · Chenjun Xiao · Luckeciano Carvalho Melo · Rishabh Agarwal · Tianhe Yu · Glen Berseth · Devendra Singh Chaplot · Jie Tang · Anirudh Srinivasan · Tharun Kumar Reddy Medini · Aaron Havens · Misha Laskin · Asier Mujika · Rohan Saphal · Joseph Marino · Alex Ray · Joshua Achiam · Ajay Mandlekar · Zhuang Liu · Danijar Hafner · Zhiwen Tang · Ted Xiao · Michael Walton · Jeff Druce · Ferran Alet · Zhang-Wei Hong · Stephanie Chan · Anusha Nagabandi · Hao Liu · Hao Sun · Ge Liu · Dinesh Jayaraman · John Co-Reyes · Sophia Sanborn -
2019 Workshop: Science meets Engineering of Deep Learning »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 Poster: Wasserstein Dependency Measure for Representation Learning »
Sherjil Ozair · Corey Lynch · Yoshua Bengio · Aaron van den Oord · Sergey Levine · Pierre Sermanet -
2019 Poster: Unsupervised State Representation Learning in Atari »
Ankesh Anand · Evan Racah · Sherjil Ozair · Yoshua Bengio · Marc-Alexandre Côté · R Devon Hjelm -
2019 Poster: Shaping Belief States with Generative Environment Models for RL »
Karol Gregor · Danilo Jimenez Rezende · Frederic Besse · Yan Wu · Hamza Merzic · Aaron van den Oord -
2019 Poster: Generating Diverse High-Fidelity Images with VQ-VAE-2 »
Ali Razavi · Aaron van den Oord · Oriol Vinyals -
2019 Demonstration: The Option Keyboard: Combining Skills in Reinforcement Learning »
Daniel Toyama · Shaobo Hou · Gheorghe Comanici · Andre Barreto · Doina Precup · Shibl Mourad · Eser Aygün · Philippe Hamel -
2019 Poster: DAC: The Double Actor-Critic Architecture for Learning Options »
Shangtong Zhang · Shimon Whiteson -
2019 Poster: The Option Keyboard: Combining Skills in Reinforcement Learning »
Andre Barreto · Diana Borsa · Shaobo Hou · Gheorghe Comanici · Eser Aygün · Philippe Hamel · Daniel Toyama · jonathan j hunt · Shibl Mourad · David Silver · Doina Precup -
2019 Poster: Generalized Off-Policy Actor-Critic »
Shangtong Zhang · Wendelin Boehmer · Shimon Whiteson -
2018 : Coffee Break and Poster Session I »
Pim de Haan · Bin Wang · Dequan Wang · Aadil Hayat · Ibrahim Sobh · Muhammad Asif Rana · Thibault Buhet · Nicholas Rhinehart · Arjun Sharma · Alex Bewley · Michael Kelly · Lionel Blondé · Ozgur S. Oguz · Vaibhav Viswanathan · Jeroen Vanbaar · Konrad Żołna · Negar Rostamzadeh · Rowan McAllister · Sanjay Thakur · Alexandros Kalousis · Chelsea Sidrane · Sujoy Paul · Daphne Chen · Michal Garmulewicz · Henryk Michalewski · Coline Devin · Hongyu Ren · Jiaming Song · Wen Sun · Hanzhang Hu · Wulong Liu · Emilie Wirbel -
2018 Poster: The challenge of realistic music generation: modelling raw audio at scale »
Sander Dieleman · Aaron van den Oord · Karen Simonyan -
2017 Poster: Neural Discrete Representation Learning »
Aaron van den Oord · Oriol Vinyals · koray kavukcuoglu -
2017 Poster: Plan, Attend, Generate: Planning for Sequence-to-Sequence Models »
Caglar Gulcehre · Francis Dutil · Adam Trischler · Yoshua Bengio -
2016 Poster: Conditional Image Generation with PixelCNN Decoders »
Aaron van den Oord · Nal Kalchbrenner · Lasse Espeholt · koray kavukcuoglu · Oriol Vinyals · Alex Graves -
2016 Poster: Disentangling factors of variation in deep representation using adversarial training »
Michael Mathieu · Junbo Jake Zhao · Junbo (Jake) Zhao · Aditya Ramesh · Pablo Sprechmann · Yann LeCun -
2015 : Nando de Freitas »
Nando de Freitas -
2015 Poster: Learning to Linearize Under Uncertainty »
Ross Goroshin · Michael Mathieu · Yann LeCun -
2015 Demonstration: Vitruvian Science: a visual editor for quickly building neural networks in the cloud »
Markus Beissinger · Sherjil Ozair -
2014 Poster: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2014 Poster: Generative Adversarial Nets »
Ian Goodfellow · Jean Pouget-Abadie · Mehdi Mirza · Bing Xu · David Warde-Farley · Sherjil Ozair · Aaron Courville · Yoshua Bengio -
2014 Oral: Sequence to Sequence Learning with Neural Networks »
Ilya Sutskever · Oriol Vinyals · Quoc V Le -
2014 Poster: Factoring Variations in Natural Images with Deep Gaussian Mixture Models »
Aaron van den Oord · Benjamin Schrauwen -
2013 Demonstration: Deep Content-Based Music Recommendation »
Aaron van den Oord · Sander Dieleman · Benjamin Schrauwen -
2013 Poster: Predicting Parameters in Deep Learning »
Misha Denil · Babak Shakibi · Laurent Dinh · Marc'Aurelio Ranzato · Nando de Freitas -
2013 Poster: Deep content-based music recommendation »
Aaron van den Oord · Sander Dieleman · Benjamin Schrauwen -
2012 Poster: Learning with Recursive Perceptual Representations »
Oriol Vinyals · Yangqing Jia · Li Deng · Trevor Darrell