Deep learning has flourished in the last decade. Recent breakthroughs have shown stunning results, and yet, researchers still cannot fully explain why neural networks generalise so well or why some architectures or optimizers work better than others. There is a lack of understanding of existing deep learning systems, which led NeurIPS 2017 test of time award winners Rahimi & Recht to compare machine learning with alchemy and to call for the return of the 'rigour police'.
Despite excellent theoretical work in the field, deep neural networks are so complex that they may not be fully comprehensible through theory alone. Unfortunately, the experimental alternative - rigorous work that neither proves a theorem nor proposes a new method - is currently under-valued in the machine learning community.
To change this, this workshop aims to promote the method of empirical falsification.
We solicit contributions which explicitly formulate a hypothesis related to deep learning or its applications (based on first principles or prior work), and then empirically falsify it through experiments. We further encourage submissions to go a layer deeper and investigate the causes of an initial idea not working as expected. This workshop will showcase how negative results offer important …
Recent advances in deep learning and artificial intelligence have equipped autonomous agents with increasing intelligence, enabling human-level performance in challenging tasks. In particular, these agents have shown great potential in interacting and collaborating with humans (e.g., self-driving cars, industrial robot co-workers, smart homes and domestic robots). However, the opaque nature of deep learning models makes it difficult to decipher the decision-making process of the agents, which prevents stakeholders from readily trusting them, especially in safety-critical tasks that require physical human interaction. In this workshop, we bring together experts with diverse and interdisciplinary backgrounds to build a roadmap for developing and deploying trustworthy interactive autonomous systems at scale. Specifically, we aim to address the following questions: 1) What properties are required for building trust between humans and interactive autonomous systems? How can we assess and ensure these properties without compromising the expressiveness of the models and the performance of the overall systems? 2) How can we develop and deploy trustworthy autonomous agents under an efficient and trustful workflow? How should we transfer from development to deployment? 3) How should we define standard metrics to quantify trustworthiness, from regulatory, theoretical, and experimental perspectives? How do we know that the …
As machine learning models find increasing use in the real world, their safe and reliable deployment depends on their robustness to distribution shift. This is especially true for sequential data, which occurs naturally in domains such as natural language processing, healthcare, computational biology, and finance. However, building models for sequence data that are robust to distribution shifts presents a unique challenge. Sequential data are often discrete rather than continuous, exhibit difficult-to-characterize distributions, and can display a much greater range of distributional shifts. Although many methods for improving model robustness exist for imaging or tabular data, extending these methods to sequential data is a challenging research direction that often requires fundamentally different techniques.
This workshop aims to facilitate progress towards improving the distributional robustness of models trained on sequential data by bringing together researchers to tackle a wide variety of research questions including, but not limited to:
(1) How well do existing robustness methods work on sequential data, and why do they succeed or fail?
(2) How can we leverage the sequential nature of the data to develop novel and distributionally robust methods?
(3) How do we construct and utilize formalisms for distribution …
The second version of the Efficient Natural Language and Speech Processing (ENLSP-II) workshop focuses on fundamental and challenging problems to make natural language and speech processing (especially pre-trained models) more efficient in terms of Data, Model, Training, and Inference. The workshop program offers an interactive platform for gathering different experts and talents from academia and industry through invited talks, panel discussions, paper submissions, reviews, interactive posters, oral presentations and a mentorship program.
This will be a unique opportunity to address the efficiency issues of current models, build connections, exchange ideas and brainstorm solutions, and foster future collaborations. The topics of this workshop can be of interest to people working on general machine learning, deep learning, optimization, theory and NLP & Speech applications.
Many cognitive and neural systems can be described in terms of compression and transmission of information given bounded resources. While information theory, as a principled mathematical framework for characterizing such systems, has been widely applied in neuroscience and machine learning, its role in understanding cognition has traditionally been contested. This traditional view has been changing in recent years, with growing evidence that information-theoretic optimality principles underlie a wide range of cognitive functions, including perception, working memory, language, and decision making. In parallel, there has also been a surge of contemporary information-theoretic approaches in machine learning, enabling large-scale neural-network implementation of information-theoretic models.
These scientific and technological developments open up new avenues for progress toward an integrative computational theory of human and artificial cognition, by leveraging information-theoretic principles as bridges between various cognitive functions and neural representations. This workshop aims to explore these new research directions and bring together researchers from machine learning, cognitive science, neuroscience, linguistics, economics, and potentially other fields, who are interested in integrating information-theoretic approaches that have thus far been studied largely independently of each other. In particular, we aim to discuss questions and exchange ideas along the following directions:
- Understanding human cognition: To what extent …
OPT 2022 will bring experts in optimization to share their perspectives while leveraging crossover experts in ML to share their views and recent advances. OPT 2022 honors this tradition of bringing together people from optimization and from ML in order to promote and generate new interactions between the two communities.
To foster the spirit of innovation and collaboration, a goal of this workshop, OPT 2022 will focus the contributed talks on research in Reliable Optimization Methods for ML. Many optimization algorithms for ML were originally developed with the goal of handling computational constraints (e.g., stochastic gradient based algorithms). Moreover, the analyses of these algorithms followed the classical optimization approach, where one measures the performance of an algorithm based on (i) the computation cost and (ii) convergence for any input into the algorithm. As engineering capabilities increase and ML is widely adopted in real-world applications, practitioners of ML are seeking optimization algorithms that go beyond finding the minimizer with the fastest algorithm. They want reliable methods that solve the real-world complications that arise. For example, bad actors are increasingly attempting to fool models with deceptive data. This leads to questions such as what algorithms are more robust to adversarial …
Self-Driving Materials Laboratories have greatly advanced the automation of material design and discovery. They require the integration of diverse fields and consist of three primary components, which intersect with many AI-related research topics:
- AI-Guided Design. This component intersects heavily with algorithmic research at NeurIPS, including (but not limited to) various topic areas such as: Reinforcement Learning and data-driven modeling of physical phenomena using Neural Networks (e.g. Graph Neural Networks and Machine Learning For Physics).
- Automated Chemical Synthesis. This component intersects significantly with robotics research represented at NeurIPS, and includes several parts of real-world robotic systems such as: managing control systems (e.g. Reinforcement Learning) and different sensor modalities (e.g. Computer Vision), as well as predictive models for various phenomena (e.g. Data-Based Prediction of Chemical Reactions).
- Automated Material Characterization. This component intersects heavily with a diverse set of supervised learning techniques that are well-represented at NeurIPS such as: computer vision for microscopy images and automated machine learning based analysis of data generated from different kinds of instruments (e.g. X-Ray based diffraction data for determining material structure).
Advances in machine learning owe much to the public availability of high-quality benchmark datasets and the well-defined problem settings that they encapsulate. Examples are abundant: CIFAR-10 for image classification, COCO for object detection, SQuAD for question answering, BookCorpus for language modelling, etc. There is a general belief that the accessibility of high-quality benchmark datasets is central to the thriving of our community.
However, three prominent issues affect benchmark datasets: data scarcity, privacy, and bias. They already manifest in many existing benchmarks, and also make the curation and publication of new benchmarks difficult (if not impossible) in numerous high-stakes domains, including healthcare, finance, and education. Hence, although ML holds strong promise in these domains, the lack of high-quality benchmark datasets creates a significant hurdle for the development of methodology and algorithms and leads to missed opportunities.
Synthetic data is a promising solution to the key issues of benchmark dataset curation and publication. Specifically, high-quality synthetic data generation could be done while addressing the following major issues.
1. Data Scarcity. The training and evaluation of ML algorithms require datasets with a sufficient sample size. Note that even if the algorithm can learn from very few samples, we still need sufficient validation data …
Transfer learning from large pre-trained language models (PLMs) has become the de facto method for a wide range of natural language processing tasks. Current transfer learning methods, combined with PLMs, have seen outstanding successes in transferring knowledge to new tasks, domains, and even languages. However, existing methods, including fine-tuning, in-context learning, parameter-efficient tuning, semi-parametric models with knowledge augmentation, etc., still lack consistently good performance across different tasks, domains, varying sizes of data resources, and diverse textual inputs.
This workshop aims to invite researchers from different backgrounds to share their latest work in efficient and robust transfer learning methods, discuss challenges and risks of transfer learning models when deployed in the wild, understand positive and negative transfer, and also debate over future directions.
We develop large models to “understand” images, videos and natural language that fuel many intelligent applications from text completion to self-driving cars. But tabular data has long been overlooked despite its dominant presence in data-intensive systems. By learning latent representations from (semi-)structured tabular data, pretrained table models have shown preliminary but impressive performance for semantic parsing, question answering, table understanding, and data preparation. As these early advances reveal a huge potential for making an impact on various downstream applications, the time has come to consider tabular data as a first-class modality for representation learning and stimulate advances in this direction.
The First Table Representation Learning workshop is the first workshop in this emerging research area and is centered around three main goals:
1) Motivate tabular data as a primary modality for representation learning and further shape this area.
2) Showcase impactful applications of pretrained table models and discuss future opportunities thereof.
3) Foster discussion and collaboration across the machine learning, natural language processing, and data management communities.
Speakers
Alon Halevy (keynote), Meta AI
Graham Neubig (keynote), Carnegie Mellon University
Carsten Binnig, TU Darmstadt
Bei Chen, Microsoft Research
Çağatay Demiralp, Sigma Computing
Huan Sun, Ohio State University
Xinyun Chen, Google Brain
Panelists …
Mathematical reasoning is a unique aspect of human intelligence and a fundamental building block for scientific and intellectual pursuits. However, learning mathematics is often a challenging human endeavor that relies on expert instructors to create, teach and evaluate mathematical material. From an educational perspective, AI systems that aid in this process offer increased inclusion and accessibility, efficiency, and understanding of mathematics. Moreover, building systems capable of understanding, creating, and using mathematics offers a unique setting for studying reasoning in AI. This workshop will investigate the intersection of mathematics education and AI.
Optimization is a cornerstone of nearly all modern machine learning (ML) and deep learning (DL). Simple first-order gradient-based methods dominate the field for convincing reasons: low computational cost, simplicity of implementation, and strong empirical results.
Yet second- or higher-order methods are rarely used in DL, despite also having many strengths: faster per-iteration convergence, frequent explicit regularization on step-size, and better parallelization than SGD. Additionally, many scientific fields use second-order optimization with great success.
A driving factor for this is the large difference in development effort. By the time higher-order methods were tractable for DL, first-order methods such as SGD and its main variants (SGD + Momentum, Adam, …) already had many years of maturity and mass adoption.
The purpose of this workshop is to address this gap, to create an environment where higher-order methods are fairly considered and compared against one another, and to foster healthy discussion with the end goal of mainstream acceptance of higher-order methods in ML and DL.
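As an illustration of the per-iteration convergence claim made above for second-order methods, the following is a minimal sketch and not taken from the workshop material. It assumes a hypothetical toy objective f(x, y) = 0.5 * (x^2 + 10 * y^2), a fixed learning rate of 0.1, and illustrative function names; it counts how many update steps plain gradient descent and a Newton step need to reach a small tolerance.

```python
def grad(x, y):
    # Gradient of the assumed toy objective f(x, y) = 0.5 * (x**2 + 10 * y**2).
    return x, 10.0 * y


def gradient_descent_step(x, y, lr=0.1):
    # First-order update: move against the gradient with a fixed step size.
    gx, gy = grad(x, y)
    return x - lr * gx, y - lr * gy


def newton_step(x, y):
    # Second-order update: rescale the gradient by the inverse Hessian.
    # For this objective the Hessian is diag(1, 10), so the inverse is diag(1, 0.1).
    gx, gy = grad(x, y)
    return x - gx / 1.0, y - gy / 10.0


def steps_to_converge(step, start=(1.0, 1.0), tol=1e-6, max_steps=10_000):
    # Count the updates needed until both coordinates are within tol of the minimizer (0, 0).
    x, y = start
    for n in range(max_steps):
        if max(abs(x), abs(y)) < tol:
            return n
        x, y = step(x, y)
    return max_steps


if __name__ == "__main__":
    print("gradient descent steps:", steps_to_converge(gradient_descent_step))
    print("Newton steps:", steps_to_converge(newton_step))
```

On a quadratic, the Newton step reaches the minimizer in a single update, while gradient descent is limited by the poorly scaled x direction. The sketch ignores the cost of forming and inverting the Hessian, which is the main practical obstacle at deep learning scale.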