Evaluating the general abilities of intelligent agents requires complex simulation environments. Existing benchmarks typically evaluate only one narrow task per environment, requiring researchers to perform expensive training runs on many different environments. We introduce Crafter, an open world survival game with visual inputs that evaluates a wide range of general abilities within a single environment. Agents either learn from the provided reward signal or through intrinsic objectives and are evaluated by semantically meaningful achievements that can be unlocked during each episode, such as discovering resources and crafting tools. Consistently unlocking all achievements requires strong generalization, deep exploration, and long-term reasoning. We experimentally verify that Crafter is of appropriate difficulty to drive future research and provide baseline scores of reward agents and unsupervised agents. Furthermore, we observe sophisticated behaviors emerging from maximizing the reward signal, such as building tunnel systems, bridges, houses, and plantations. We hope that Crafter will accelerate research progress by quickly evaluating a wide spectrum of abilities.
Author Information
Danijar Hafner (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2021: Benchmarking the Spectrum of Agent Capabilities
  Mon. Dec 13th 05:45 -- 05:57 PM
More from the Same Authors
- 2021: Learning Robust Dynamics through Variational Sparse Gating
  Arnav Kumar Jain · Shivakanth Sujit · Shruti Joshi · Vincent Michalski · Danijar Hafner · Samira Ebrahimi Kahou
- 2021: Benchmarking the Spectrum of Agent Capabilities Q&A
  Danijar Hafner
- 2021 Poster: Discovering and Achieving Goals via World Models
  Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak
- 2021 Poster: Clockwork Variational Autoencoders
  Vaibhav Saxena · Jimmy Ba · Danijar Hafner
- 2021 Poster: Information is Power: Intrinsic Control via Information Capture
  Nicholas Rhinehart · Jenny Wang · Glen Berseth · John Co-Reyes · Danijar Hafner · Chelsea Finn · Sergey Levine
- 2019 Poster: Bayesian Layers: A Module for Neural Network Uncertainty
  Dustin Tran · Mike Dusenberry · Mark van der Wilk · Danijar Hafner