
The MineRL BASALT Competition on Fine-tuning from Human Feedback
Anssi Kanervisto · Stephanie Milani · Karolis Ramanauskas · Byron Galbraith · Steven Wang · Brandon Houghton · Sharada Mohanty · Rohin Shah

Tue Dec 06 03:00 AM -- 06:00 AM (PST) @ Virtual
Event URL: https://www.aicrowd.com/challenges/neurips-2022-minerl-basalt-competition

Given the impressive capabilities demonstrated by pre-trained foundation models, we must now grapple with how to harness these capabilities towards useful tasks. Since many such tasks are hard to specify programmatically, researchers have turned towards a different paradigm: fine-tuning from human feedback. The MineRL BASALT competition aims to spur research on this important class of techniques, in the domain of the popular video game Minecraft.

The competition consists of a suite of four tasks with hard-to-specify reward functions. We define these tasks by a paragraph of natural language: for example, "create a waterfall and take a scenic picture of it", with additional clarifying details. Participants train a separate agent for each task, using any method they want; we expect participants will choose to fine-tune the provided pre-trained models. Agents are then evaluated by humans who have read the task description. To help participants get started, we provide a dataset of human demonstrations of the four tasks, as well as an imitation learning baseline that leverages these demonstrations.

We believe this competition will improve our ability to build AI systems that do what their designers intend them to do, even when intent cannot be easily formalized. This achievement will allow AI to solve more tasks, enable more effective regulation of AI systems, and make progress on the AI alignment problem.

Author Information

Anssi Kanervisto (Microsoft Research)
Stephanie Milani (Carnegie Mellon University)
Karolis Ramanauskas (University of Bath)

PhD Student in Reinforcement Learning

Byron Galbraith (Seva)

Byron Galbraith is the CTO of Seva, where he works to translate the latest advancements in machine learning and natural language processing to build AI-powered conversational agents. Byron has a PhD in Cognitive and Neural Systems from Boston University and an MS in Bioinformatics from Marquette University. His research expertise includes brain-computer interfaces, neuromorphic robotics, spiking neural networks, high-performance computing, and natural language processing. Byron has also held several software engineering roles including back-end system engineer, full stack web developer, office automation consultant, and game engine developer at companies ranging in size from a two-person startup to a multi-national enterprise.

Steven Wang (UC Berkeley)
Brandon Houghton (OpenAI)
Sharada Mohanty (AIcrowd SA)
Rohin Shah (DeepMind)
