NeurIPS 2022 Competition Track Program
Below you will find a brief summary of the competitions accepted at NeurIPS 2022.
Competitions are grouped by category. All prizes are tentative and depend solely on the organizing team of each competition and the corresponding sponsors. Please note that all information is subject to change; visit the competition websites regularly and contact the organizers of each competition directly for more information.
Special Topics in Machine Learning
OGB-LSC 2022: A Large-Scale Challenge for ML on Graphs
Weihua Hu (Stanford University); Matthias Fey (TU Dortmund); Hongyu Ren (Stanford University); Maho Nakata (RIKEN); Yuxiao Dong (Tsinghua); Jure Leskovec (Stanford University)
Enabling effective and efficient machine learning (ML) over large-scale graph data (e.g., graphs with billions of edges) can have a huge impact on both industrial and scientific applications. At KDD Cup 2021, we organized the OGB Large-Scale Challenge (OGB-LSC), where we provided large and realistic graph ML tasks. Our KDD Cup attracted huge attention from the graph ML community (more than 500 team registrations across the globe) and led to innovative methods that yielded significant performance breakthroughs. However, the problem of machine learning over large graphs is not yet solved, and it is important for the community to engage in a focused multi-year effort in this area (like ImageNet and MS-COCO). Here we propose an annual ML challenge around large-scale graph datasets, which will drive forward method development and allow for tracking progress. We propose the 2nd OGB-LSC (referred to as OGB-LSC 2022) around the OGB-LSC datasets. Our proposed challenge consists of three tracks, covering the core graph ML tasks of node-level prediction (academic paper classification with 240 million nodes), link-level prediction (knowledge graph completion with 90 million entities), and graph-level prediction (molecular property prediction with 4 million graphs). Importantly, we have updated two of the three datasets based on the lessons learned from our KDD Cup, so that the resulting datasets are more challenging and realistic. Our datasets are extensively validated through our baseline analyses and last year’s KDD Cup. We also provide the baseline code as well as a Python package to easily load the datasets and evaluate model performance.
Cross-Domain MetaDL: Any-Way Any-Shot Learning Competition with Novel Datasets from Practical Domains
Dustin Carrión (LISN/INRIA/CNRS, Université Paris-Saclay, France), Ihsan Ullah (LISN/INRIA/CNRS, Université Paris-Saclay, France), Sergio Escalera (Universitat de Barcelona and Computer Vision Center, Spain, and ChaLearn, USA), Isabelle Guyon (LISN/INRIA/CNRS, Université Paris-Saclay, France, and ChaLearn, USA), Felix Mohr (Universidad de La Sabana, Colombia), Manh Hung Nguyen (ChaLearn, USA), Joaquin Vanschoren (TU Eindhoven, the Netherlands).
Meta-learning aims to leverage the experience from previous tasks to solve new tasks using little training data, training faster, and/or achieving better performance. The proposed challenge focuses on "cross-domain meta-learning" for few-shot image classification using a novel "any-way" and "any-shot" setting. The goal is to meta-learn a good model that can quickly learn tasks from a variety of domains, with any number of classes, also called "ways" (within the range 2-20), and any number of training examples per class, also called "shots" (within the range 1-20). We carve such tasks from various "mother datasets" selected from diverse domains, such as healthcare, ecology, biology, manufacturing, and others. By using mother datasets from these practical domains, we aim to maximize the humanitarian and societal impact. The competition uses code submission and is fully blind-tested on the CodaLab challenge platform. A single (final) submission will be evaluated during the final phase, using ten datasets previously unused by the meta-learning community. After the competition is over, it will remain active to be used as a long-lasting benchmark resource for research in this field. The scientific and technical motivations of this challenge include scalability, robustness to domain changes, and generalization ability to tasks (a.k.a. episodes) in different regimes (any-way any-shot).
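As an illustration of the any-way any-shot setting described above, the following minimal Python sketch carves one task (episode) out of a labeled dataset. It is not the organizers' task generator: the function name sample_episode, its parameters, and the choice of one query example per class are assumptions made for this example. Only the 2-20 way and 1-20 shot ranges come from the description.

```python
import random
from collections import defaultdict


def sample_episode(labels, rng, min_ways=2, max_ways=20,
                   min_shots=1, max_shots=20, query_per_class=1):
    """Sample one any-way any-shot episode (hypothetical helper).

    labels: list of class labels, one per example in a "mother dataset".
    Returns (support_indices, query_indices, chosen_classes, n_shots).
    """
    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw the number of shots, then keep only classes that have enough
    # examples for the support set plus the query set.
    n_shots = rng.randint(min_shots, max_shots)
    needed = n_shots + query_per_class
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= needed]
    if len(eligible) < min_ways:
        raise ValueError("not enough classes with sufficient examples")

    # Draw the number of ways and the classes themselves.
    n_ways = rng.randint(min_ways, min(max_ways, len(eligible)))
    classes = rng.sample(eligible, n_ways)

    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], needed)
        support.extend(picked[:n_shots])   # training examples ("shots")
        query.extend(picked[n_shots:])     # held-out examples for evaluation
    return support, query, classes, n_shots


if __name__ == "__main__":
    # Toy dataset: 25 classes with 30 examples each (made-up data).
    toy_labels = [i % 25 for i in range(25 * 30)]
    s, q, cls, k = sample_episode(toy_labels, random.Random(0))
    print(f"{len(cls)}-way {k}-shot episode, {len(s)} support / {len(q)} query")
```

Drawing the number of shots before the classes keeps every selected class large enough to fill both the support and query sets; other sampling orders are equally plausible.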
AutoML for the 2020s: Diverse Tasks, Modern Methods, and Efficiency at Scale
Samuel Guo (Carnegie Mellon University), Cong Xu (Hewlett Packard Labs), Nicholas Roberts (University of Wisconsin-Madison), Mikhail Khodak (Carnegie Mellon University), Junhong Shen (Carnegie Mellon University), Evan Sparks (Hewlett Packard Enterprise), Ameet Talwalkar (Carnegie Mellon University), Yuriy Nevmyvaka (Morgan Stanley), Frederic Sala (University of Wisconsin-Madison), Kashif Rasul (Morgan Stanley), Anderson Schneider (Morgan Stanley)
As more areas beyond the traditional AI domains (e.g., computer vision and natural language processing) seek to take advantage of data-driven tools, the need for developing ML systems that can adapt to a wide range of downstream tasks in an efficient and automatic way continues to grow. The AutoML for the 2020s competition aims to catalyze research in this area and establish a benchmark for the current state of automated machine learning. Unlike previous challenges which focus on a single class of methods such as non-deep-learning AutoML, hyperparameter optimization, or meta-learning, this competition proposes to (1) evaluate automation on a diverse set of small and large-scale tasks, and (2) allow the incorporation of the latest methods such as neural architecture search and unsupervised pretraining. To this end, we curate 20 datasets that represent a broad spectrum of practical applications in scientific, technological, and industrial domains. Participants are given a set of 10 development tasks selected from these datasets and are required to come up with automated programs that perform well on as many problems as possible and generalize to the remaining private test tasks. To ensure efficiency, the evaluation will be conducted under a fixed computational budget. To ensure robustness, the performance profiles methodology is used for determining the winners. The organizers will provide computational resources to the participants as needed and monetary prizes to the winners.
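The abstract does not spell out how performance profiles are computed for this competition. As a rough sketch, the standard Dolan-Moré formulation divides each method's cost on a task by the best cost any method achieved on that task, then reports, for a threshold tau, the fraction of tasks on which the method is within a factor tau of the best. The function names and the toy costs below are assumptions made for illustration.

```python
def performance_ratios(costs):
    """Per-task ratio of each method's cost to the best cost on that task.

    costs: dict mapping method name -> list of positive costs, one per task
           (lower is better; all lists must have the same length).
    """
    methods = list(costs)
    n_tasks = len(costs[methods[0]])
    best = [min(costs[m][t] for m in methods) for t in range(n_tasks)]
    return {m: [costs[m][t] / best[t] for t in range(n_tasks)] for m in methods}


def performance_profile(ratios, tau):
    """Fraction of tasks on which each method is within a factor tau of the best."""
    return {m: sum(r <= tau for r in rs) / len(rs) for m, rs in ratios.items()}


if __name__ == "__main__":
    # Made-up error rates for two hypothetical methods on three tasks.
    toy_costs = {"A": [1.0, 2.0, 4.0], "B": [2.0, 2.0, 1.0]}
    toy_ratios = performance_ratios(toy_costs)
    for tau in (1.0, 2.0, 4.0):
        print(tau, performance_profile(toy_ratios, tau))
```

Because each task is normalized by its own best result, a single task with very large raw errors cannot dominate the ranking, which is one reason this kind of profile is considered robust.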
The Trojan Detection Challenge
Mantas Mazeika (UIUC), Dan Hendrycks (UC Berkeley), Huichen Li (UIUC), Xiaojun Xu (UIUC), Sidney Hough (Stanford), Arezoo Rajabi (UW), Dawn Song (UC Berkeley), Radha Poovendran (UW), Bo Li (UIUC), David Forsyth (UIUC)
A growing concern for the security of ML systems is the possibility of Trojan attacks on neural networks. There is now considerable literature on methods for detecting these attacks. We propose the Trojan Detection Challenge to further the community's understanding of methods to construct and detect Trojans. This competition will consist of complementary tracks on detecting/analyzing Trojans and creating evasive Trojans. Participants will be tasked with devising methods to better detect Trojans using a new dataset containing over 6,000 neural networks. Code and evaluations from three established baseline detectors will provide a starting point, and a novel Minimal Trojan attack will challenge participants to push the state of the art in Trojan detection. Ultimately, we hope our competition spurs practical innovations and clarifies deep questions surrounding the offense-defense balance of Trojan attacks.
Causal Insights for Learning Paths in Education
Wenbo Gong (Microsoft Research), Digory Smith (Eedi), Zichao Wang (Rice University), Simon Woodhead (Eedi), Nick Pawlowski (Microsoft Research), Joel Jennings (Microsoft Research), Cheng Zhang (Microsoft Research), Craig Barton (Eedi)
In this competition, participants will address two fundamental causal challenges in machine learning in the context of education using time-series data. The first is to identify the causal relationships between different constructs, where a construct is defined as the smallest element of learning. The second challenge is to predict the impact of learning one construct on the ability to answer questions on other constructs. Addressing these challenges will enable optimisation of students' knowledge acquisition, which can be deployed in a real edtech solution impacting millions of students. Participants will run these tasks in an idealised environment with synthetic data and a real-world scenario with evaluation data collected from a series of A/B tests.
Natural Language Processing and Understanding
NL4Opt: Formulating Optimization Problems Based on Their Natural Language Descriptions
Rindranirina Ramamonjison (Huawei Technologies Canada), Amin Banitalebi-Dehkordi (Huawei Technologies Canada), Giuseppe Carenini (University of British Columbia), Bissan Ghaddar (Ivey Business School), Timothy Yu (Huawei Technologies Canada), Haley Li (University of British Columbia), Raymond Li (University of British Columbia), Zirui Zhou (Huawei Technologies Canada), Yong Zhang (Huawei Technologies Canada)
We propose a competition for extracting the meaning and formulation of an optimization problem based on its text description. For this competition, we have created the first dataset of linear programming (LP) word problems. A deep understanding of the problem description is an important first step toward generating the problem formulation. Therefore, we present two challenging sub-tasks for the participants. For the first sub-task, the goal is to recognize and label the semantic entities that correspond to the components of the optimization problem. For the second sub-task, the goal is to generate a meaningful representation (i.e., a logical form) of the problem from its description and its problem entities. This intermediate representation of an LP problem will be converted to a canonical form for evaluation. The proposed task will be attractive because of its compelling application, the low barrier to entry of the first sub-task, and the new set of challenges the second sub-task brings to semantic analysis and evaluation. The goal of this competition is to increase the accessibility and usability of optimization solvers, allowing non-experts to solve important problems from various industries. In addition, this new task will promote the development of novel machine learning applications and datasets for operations research.
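To make the idea of converting an intermediate representation into a canonical form more concrete, here is a minimal Python sketch. The competition's actual representation and canonical form are not specified in this summary, so everything below is an assumption: the dictionary layout, the function name to_canonical, and the target form "maximize c.x subject to A x <= b" are chosen only for illustration.

```python
def to_canonical(problem, variables):
    """Convert a hypothetical declaration-style LP into (c, A, b).

    Assumed target form: maximize c.x subject to A x <= b.
    problem: dict with keys "direction" ("max" or "min"), "objective"
             (variable -> coefficient), and "constraints" (list of dicts
             with "coeffs", "op" in {"<=", ">="}, and "rhs").
    variables: fixed ordering of variable names for the vector layout.
    """
    # A minimization objective becomes a maximization by negating c.
    sign = 1.0 if problem["direction"] == "max" else -1.0
    c = [sign * problem["objective"].get(v, 0.0) for v in variables]

    A, b = [], []
    for con in problem["constraints"]:
        row = [con["coeffs"].get(v, 0.0) for v in variables]
        if con["op"] == "<=":
            A.append(row)
            b.append(con["rhs"])
        elif con["op"] == ">=":
            # Multiply both sides by -1 to turn ">=" into "<=".
            A.append([-a for a in row])
            b.append(-con["rhs"])
        else:
            raise ValueError(f"unsupported operator: {con['op']}")
    return c, A, b


if __name__ == "__main__":
    # Made-up word problem: maximize 3x + 2y with x + y <= 10 and x >= 2.
    toy = {
        "direction": "max",
        "objective": {"x": 3.0, "y": 2.0},
        "constraints": [
            {"coeffs": {"x": 1.0, "y": 1.0}, "op": "<=", "rhs": 10.0},
            {"coeffs": {"x": 1.0}, "op": ">=", "rhs": 2.0},
        ],
    }
    print(to_canonical(toy, ["x", "y"]))
```

Fixing one canonical form means that two differently worded but equivalent formulations map to the same (c, A, b), which is what makes automatic comparison against a reference possible.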
IGLU: Interactive Grounded Language Understanding in a Collaborative Environment
Julia Kiseleva (MSR), Alexey Skrynnik (MIPT), Artem Zholus (MIPT), Shrestha Mohanty (Microsoft), Negar Arabzadeh (University of Waterloo), Marc-Alexandre Côté (MSR), Mohammad Aliannejadi (University of Amsterdam), Milagro Teruel (MSR), Ziming Li (Amazon Alexa), Mikhail Burtsev (MIPT), Maartje ter Hoeve (University of Amsterdam), Zoya Volovikova (MIPT), Aleksandr Panov (MIPT), Yuxuan Sun (Meta AI), Kavya Srinet (Meta AI), Arthur Szlam (Meta AI), Ahmed Awadallah (MSR)