
Competition Track Day 1
Douwe Kiela · Barbara Caputo

Tue Dec 07 10:00 AM -- 12:05 PM (PST)
Event URL: https://neurips.cc/Conferences/2021/CompetitionTrack

The program includes exciting competitions ranging from zero-resource speech, to interactive grounded language understanding, to designing new catalysts for renewable energy, to name just a few. It spans a wide variety of domains, with some competitions focusing more on applications and others trying to unify fields or focusing on technical challenges such as understanding the fidelity of approximate inference. We hope the breadth of the program means that anyone who wants to work on a competition can find something to their liking.

Tue 10:00 a.m. - 10:05 a.m.
Introduction Competition Day 1 (Intro)
Douwe Kiela
Tue 10:05 a.m. - 10:25 a.m.

Advancements to renewable energy processes are urgently needed to address climate change and energy scarcity around the world. Many of these processes, including the generation of electricity through fuel cells and fuel generation from renewable resources, are driven by chemical reactions. The design of new catalysts that enable new and more efficient reactions is a critical bottleneck in developing cost-effective solutions. Unfortunately, the discovery of new catalyst materials is limited by the high cost of computational atomic simulations and experimental studies. Machine learning has the potential to reduce the cost of computational simulations by orders of magnitude. By filtering potential catalyst materials based on these simulations, more promising candidates may be selected for experimental testing, and the rate at which new catalysts are discovered could be greatly increased.

The Open Catalyst Challenge invites participants to submit results of machine learning models that simulate the interaction of a molecule on a catalyst's surface. ML models may either directly predict the relaxed-state atomic configuration of the entire molecule + catalyst system, or iteratively predict and integrate per-atom forces to simulate how atoms will move around starting from an arbitrary initial state. By predicting this interaction accurately, the catalyst's impact on the overall rate of a chemical reaction may be estimated, which is a key factor in filtering potential catalyst materials and addressing the world’s energy needs.
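The second option above, iteratively predicting forces and moving atoms along them, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_forces` is a hypothetical stand-in for a learned force model (a simple harmonic pull toward the origin), and the step size, force threshold, and plain steepest-descent update are arbitrary choices, not the challenge's actual relaxation procedure.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def toy_forces(positions: List[Vector]) -> List[Vector]:
    """Hypothetical stand-in for an ML force model: harmonic pull toward the origin."""
    k = 1.0  # assumed spring constant, for illustration only
    return [[-k * x for x in atom] for atom in positions]


def relax(
    positions: List[Vector],
    force_fn: Callable[[List[Vector]], List[Vector]],
    step: float = 0.1,
    fmax: float = 1e-3,
    max_steps: int = 1000,
) -> Tuple[List[Vector], int]:
    """Move atoms along predicted forces until the largest force component is below fmax."""
    pos = [list(atom) for atom in positions]  # copy so the input is not modified
    for n in range(max_steps):
        forces = force_fn(pos)
        largest = max(abs(f) for atom in forces for f in atom)
        if largest < fmax:
            return pos, n  # converged: treat this as the relaxed state
        # Steepest-descent update: each coordinate moves a small step along its force.
        pos = [
            [x + step * f for x, f in zip(atom, atom_forces)]
            for atom, atom_forces in zip(pos, forces)
        ]
    return pos, max_steps


initial = [[1.0, -2.0, 0.5], [0.3, 0.0, 0.1]]
relaxed, steps_taken = relax(initial, toy_forces)
print(steps_taken, relaxed)
```

In a real setting the force function would be the trained model, and the cost saving comes from replacing each expensive simulation step with a cheap model evaluation.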

The Open Catalyst Project is a collaborative research effort between Facebook AI Research (FAIR) and Carnegie Mellon University’s (CMU) Department of Chemical Engineering. The aim is to use AI to model and discover new catalysts for use in renewable energy storage to help in addressing climate change.

Tue 10:25 a.m. - 10:45 a.m.

Mismatch between training and deployment data, known as distributional shift, adversely impacts ML models and is ubiquitous in real, industrial applications. In this competition the contestants’ goal is to develop models which are both robust to distributional shift and can detect it via uncertainty estimation. The broad aim of this competition is to raise awareness of the issue and stimulate the community to work on tasks and modalities taken from large-scale industrial applications. Thus, we provide the "Shifts Dataset" - a new, large dataset of genuine "in the wild" examples of distributional shift from weather prediction, machine translation, and vehicle motion prediction. Each task represents a particular data-modality and is uniquely challenging. Each task will have an associated competition track with prizes for top contestants.
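One common way to obtain an uncertainty score, not necessarily the one used by the competition or its baselines, is to measure disagreement across an ensemble of models. The sketch below assumes three hypothetical toy regressors that agree near the training region (inputs close to zero) and diverge far from it; the models and the inputs are invented for illustration only.

```python
from statistics import mean, pvariance
from typing import Callable, Sequence, Tuple


def ensemble_predict(
    models: Sequence[Callable[[float], float]], x: float
) -> Tuple[float, float]:
    """Return the ensemble mean prediction and the variance across members."""
    preds = [model(x) for model in models]
    return mean(preds), pvariance(preds)


# Hypothetical ensemble: slightly different slopes, so members disagree more
# the further the input lies from the region where they roughly coincide.
models = [lambda x, a=a: a * x for a in (0.9, 1.0, 1.1)]

_, uncertainty_near = ensemble_predict(models, 0.1)   # input resembling training data
_, uncertainty_far = ensemble_predict(models, 10.0)   # input far from training data
print(uncertainty_near, uncertainty_far)
```

A larger variance on the far input is the signal that would be thresholded to flag a possible shift.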

Tue 10:45 a.m. - 11:05 a.m.

Progress in machine learning is typically measured by training and testing a model on the same distribution of data, i.e., the same domain. However, in real-world applications, models often encounter out-of-distribution data. The VisDA21 competition invites methods that can adapt to novel test distributions and handle distributional shifts. Our task is object classification, but we measure accuracy on novel domains, rather than the traditional in-domain benchmarking. Teams will be given labeled source data and unlabeled target data from a different distribution (such as novel viewpoints, backgrounds, or image quality). In addition, the target data may have missing and/or novel classes. Successful approaches will improve classification accuracy of known categories on target-domain data while learning to deal with missing and/or unknown categories.

Tue 11:05 a.m. - 11:25 a.m.

Humans can infer a wide range of properties from a perceived sound, such as information about the source (e.g. what generated the sound? where is it coming from?), the information the sound conveys (this is a word that means X, this is a musical note in scale Y), and how it compares to other sounds (these two sounds come/don't come from the same source and are/aren't identical). Can any one learned representation do the same? The aim of this competition is to develop a general-purpose audio representation that provides a meaningful basis for learning in a wide variety of tasks and scenarios. We challenge participants with the following questions: Is it possible to develop a single representation that models all psychoacoustic phenomena? What approach best generalizes to a wide range of downstream audio tasks without fine-tuning? What audio representation allows researchers to formulate and solve novel and societally-valuable problems in simple, repeatable ways? We will evaluate audio representations using a benchmark suite across a variety of domains, including speech, environmental sound, medical audio, and music. In the spirit of shared exchange, all participants must submit an audio embedding model, following a common API, that is general-purpose, open-source, and freely available to use.

Tue 11:25 a.m. - 11:45 a.m.

WebQA is a new benchmark for multimodal multihop reasoning in which systems are presented with the same style of data as humans when searching the web: snippets and images. Upon seeing a question, the system must identify which candidates in a candidate pool potentially inform the answer. The system is then expected to aggregate information from the selected candidates and reason over it to generate an answer in natural language. Each datum is a question paired with a series of potentially long snippets or images that serve as "knowledge carriers" over which to reason. Systems will be evaluated on both supporting fact retrieval and answer generation to measure correctness and interpretability. To demonstrate multihop multimodal reasoning ability, models should be able to 1) understand and represent knowledge from different modalities, 2) identify and aggregate relevant knowledge fragments scattered across multiple sources, and 3) make inferences and generate natural language.

Tue 11:45 a.m. - 12:05 p.m.

In the third MineRL Diamond competition, participants continue to develop algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve a complex task in Minecraft. The competition environment features sparse rewards, long-term planning, vision and sub-task hierarchies. To ensure that truly sample-efficient algorithms are developed, organizers re-train submitted systems on a fixed cloud-computing environment for a limited number of samples (4 days or 8 million samples). To ease the entry to machine learning research, the competition features two tracks: introduction, which allows agents developed using any method ranging from end-to-end machine learning solutions to programmatic approaches; and research, which requires participants to develop novel imitation and reinforcement learning algorithms to solve this difficult sample-limited task.


We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient’s lungs according to a trajectory of airway pressures specified by a clinician.

Hand-tuned PID controllers and similar variants have comprised the industry standard for decades, yet can behave poorly by over- or under-shooting their target or oscillating rapidly. In this competition, we have designed two tracks for a data-driven machine-learning approach: 1) train a model that simulates a ventilator-artificial lung system (the physical system) using data collected from the real system and 2) train a controller to control the real system.
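For readers unfamiliar with the PID baseline mentioned above, here is a minimal sketch of a discrete PID controller tracking a target pressure. The plant is a toy first-order model invented for illustration, not the competition's ventilator or simulator, and the gains, time constant, and target value are all assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PID:
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float, dt: float) -> float:
        """Return the control signal for one time step of length dt."""
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(
    controller: PID, target: float = 20.0, steps: int = 300, dt: float = 0.01
) -> List[float]:
    """Drive a toy first-order plant (an assumed stand-in for a lung) toward target."""
    tau = 0.1  # assumed plant time constant in seconds
    pressure = 0.0
    trace = []
    for _ in range(steps):
        u = controller.step(target, pressure, dt)
        # Euler update: pressure moves toward the control input with time constant tau.
        pressure += dt * (u - pressure) / tau
        trace.append(pressure)
    return trace


trace = simulate(PID(kp=1.0, ki=10.0, kd=0.0))
print(trace[-1])
```

On this toy plant the chosen gains settle smoothly; on a real system the same fixed gains may overshoot or oscillate, which is the behavior the data-driven tracks aim to improve on.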

In a first software-only phase, we will be providing inputs and sensor readings along with pre-trained simulators. In the second phase, participants will have regulated remote access to the physical system and will collect their own data. We have provided two baselines. A state-of-the-art result from this competition represents a strong candidate for further testing in commercial ventilators and more sophisticated artificial lungs.

Author Information

Douwe Kiela (Facebook AI Research)
Barbara Caputo (Politecnico di Torino)
