Nov. 28, 2022, 7:20 a.m.

Among the thousands of human languages used throughout the world, NLP researchers have so far focused on only a handful. This is understandable given that resources and researchers are not readily available for all languages, but it is nevertheless a profound limitation of our research community, one that must be addressed. I will discuss research on Korean and other low- to medium-resource languages and share interesting findings that extend beyond linguistic differences. I will share our work on ethnic bias in BERT language models in six different languages, which particularly illustrates the importance of studying multiple languages. I will describe our efforts in building a benchmark dataset for Korean and the main challenge of building such a dataset when the sources of data are much smaller than for English and other major languages. I will also share some preliminary results of working with non-native speakers who can potentially contribute to research in low-resource languages. Through this talk, I hope to inspire NLP researchers, myself included, to actively engage with a diverse set of languages and cultures.


Alice Oh

I am a professor at KAIST in the School of Computing with joint appointment in the Graduate School of AI. My research interests are in developing and applying machine learning models for natural language processing. In our research group, we look at various data such as news, social media, Wikipedia, and programming education.

Invited Talk: Invited Talk

Nov. 28, 2022, 7:30 a.m.


Nov. 28, 2022, 8:25 a.m.


Raesetje Sefala

Raesetje is an AI Research Fellow who uses Computer Vision, Data Science and general Machine Learning techniques mainly to explore research questions with a societal impact. Her research focuses on creating ground truth datasets and using machine learning and other computational social science techniques to study spatial segregation in post-Apartheid South Africa.

Raesetje is a qualified Data Scientist and holds a Master's degree in Computer Science from the University of the Witwatersrand, with a special focus on Machine Learning. She has been technically involved in complex Data Science projects around the world, building innovative solutions.

She is mainly interested in using AI to solve problems experienced in developing countries; creating and analysing datasets for machine learning; designing and developing efficient data science/machine learning pipelines for different types of datasets; and contributing to making better data and technology policies that serve communities.

Nov. 28, 2022, 11:20 a.m.


Bianca Zadrozny

Nov. 28, 2022, noon


Nov. 28, 2022, 12:35 p.m.

As predictive models are increasingly being employed to make consequential decisions in various real-world applications, it becomes important to ensure that relevant stakeholders and decision makers correctly understand the functionality of these models so that they can diagnose errors and potential biases in them, and decide when and how to employ these models. To this end, recent research in AI/ML has focused on developing techniques which aim to explain complex models to relevant stakeholders. In this talk, I will give a brief overview of the field of explainable AI while highlighting our research in this area. More specifically, I will discuss our work on: (a) developing inherently interpretable models and post hoc explanation methods, (b) identifying the vulnerabilities and shortcomings of these methods, and addressing them, (c) evaluating the reliability (correctness, robustness, fairness) and human understanding of the explanations output by these methods, and (d) theoretical results on unifying these methods. I will conclude this talk by shedding light on some exciting future research directions – e.g., rethinking model explainability as a (natural language) dialogue between humans and AI, and redesigning explainable AI tools to cater to large pretrained models.


Himabindu Lakkaraju

Hima Lakkaraju is an Assistant Professor at Harvard University focusing on explainability, fairness, and robustness of machine learning models. She has also been working with various domain experts in criminal justice and healthcare to understand the real world implications of explainable and fair ML. Hima has recently been named one of the 35 innovators under 35 by MIT Tech Review, and has received best paper awards at SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io/

Nov. 28, 2022, 1 p.m.


Invited Talk: Getting started with JAX

Nov. 28, 2022, 2 p.m.


Nov. 28, 2022, 3:15 p.m.

There has recently been widespread discussion of whether GPT-3, LaMDA 2, and related large language models might be sentient. Should we take this idea seriously? I will discuss the underlying issue and will break down the strongest reasons for and against.


David Chalmers

David Chalmers is University Professor of Philosophy and Neural Science and co-director of the Center for Mind, Brain, and Consciousness at New York University. He is the author of THE CONSCIOUS MIND (1996), CONSTRUCTING THE WORLD (2010), and REALITY+ (2022). He is the current president of the American Philosophical Association (Eastern Division). He co-founded the Association for the Scientific Study of Consciousness and the PhilPapers Foundation. He has given the John Locke Lectures and has been awarded the Jean Nicod Prize. He is known for formulating the “hard problem” of consciousness, which inspired Tom Stoppard’s play The Hard Problem; for the idea of the “extended mind,” which says that the tools we use can become parts of our minds; and for influential work on language and learning in neural network models and on other foundational issues in AI.

Nov. 29, 2022, 7:30 a.m.


Rediet Abebe

Nov. 29, 2022, 12:30 p.m.

Conformal inference methods are becoming all the rage in academia and industry alike. In a nutshell, these methods deliver exact prediction intervals for future observations without making any distributional assumption whatsoever other than having i.i.d., or more generally, exchangeable data. This talk will review the basic principles underlying conformal inference and survey some major contributions that have occurred in the last 2-3 years or so. We will discuss enhanced conformity scores applicable to quantitative as well as categorical labels. We will also survey novel methods which deal with situations where the distribution of observations can shift drastically — think of finance or economics, where market behavior can change over time in response to new legislation or major world events, or public health, where changes occur because of geography and/or policies. All along, we shall illustrate the methods with examples, including the prediction of election results or COVID-19 case trajectories.
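
As an illustration of the basic principle (not code from the talk), here is a minimal sketch of split conformal prediction for regression. The names `model`, `X_cal`, `y_cal`, and `X_test` are hypothetical: any fitted predictor plus a held-out calibration set of exchangeable data.

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, X_test, alpha=0.1):
    """Prediction intervals with marginal coverage >= 1 - alpha under exchangeability."""
    scores = np.abs(y_cal - model.predict(X_cal))   # conformity scores: |residuals|
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))         # finite-sample corrected rank
    q_hat = np.sort(scores)[min(k, n) - 1]          # k-th smallest calibration score
    preds = model.predict(X_test)
    return preds - q_hat, preds + q_hat             # interval [pred - q, pred + q]
```

The enhanced conformity scores mentioned in the abstract (e.g., quantile-based scores) replace the absolute residual above while keeping the same calibration logic.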


Emmanuel Candes

Nov. 29, 2022, 4 p.m.

Abstract: Competitions are set to play a bigger role in research to advance the state-of-the-art in algorithms and help solve scientific and societal challenges. Over the last 15 years, challenges in Machine Learning, Data Science and Artificial Intelligence have proven useful in education to train students and in industry to upskill workers or bring solutions otherwise confined to research. However, with the fast-paced advances in AI, what is the future role of research competitions?

In the talk, we will reflect upon the past 6 years of NeurIPS competitions, highlighting how research competitions are a powerful means of prototyping solutions to novel problems. We will propose an evolution towards engaging a broader diversity of organizers and participants and addressing real world problems experienced by global and local communities. Such evolution may lead the community to rethink how we monitor progress on those fronts and how we can better leverage the wealth of competition outcomes.

Isabelle Guyon recently joined Google Brain as a research scientist. She is also professor of artificial intelligence at Université Paris-Saclay (Orsay). Her areas of expertise include computer vision, bioinformatics, and power systems. She is best known for being a co-inventor of Support Vector Machines. Her recent interests are in automated machine learning, meta-learning, and data-centric AI. She has been a strong promoter of challenges and benchmarks, and is president of ChaLearn, a non-profit dedicated to organizing machine learning challenges. She is community lead of Codalab competitions, a challenge platform used both in academia and industry, which started as a Microsoft research project, under Evelyne Viegas. She co-organized the “Challenges in Machine Learning Workshop” @ NeurIPS between 2014 and 2019 with Evelyne Viegas and others, launched the "NeurIPS challenge track" in 2017 while she was general chair, and pushed the creation of the "NeurIPS datasets and benchmark track" in 2021, as a NeurIPS board member.

Evelyne Viegas is Senior Director of Global Research Engagement at Microsoft Research. She drives the research engagement strategy, designing a portfolio of programs which are open and collaborative with the external research community. She leads initiatives to create social impact through strategic partnerships which accelerate research and technology breakthroughs, impact from research to business and diversification of research partners, ideas and portfolio in the areas of Artificial Intelligence, Computing Systems and Experiences, working in partnership with the academic community, engineering and business groups, industry partners and government agencies worldwide. She co-founded the Challenges in Machine Learning (CiML) Workshop with Isabelle Guyon and co-organized the CiML series at NeurIPS between 2014 and 2019 with her and others to bring the community of competition organizers and participants together to share their efforts and work together on developing the science behind the competitions.


Isabelle Guyon

Isabelle Guyon recently joined Google Brain as a research scientist. She is also professor of artificial intelligence at Université Paris-Saclay (Orsay). Her areas of expertise include computer vision, bioinformatics, and power systems. She is best known for being a co-inventor of Support Vector Machines. Her recent interests are in automated machine learning, meta-learning, and data-centric AI.  She has been a strong promoter of challenges and benchmarks, and is president of ChaLearn, a non-profit dedicated to organizing machine learning challenges. She is community lead of Codalab competitions, a challenge platform used both in academia and industry. She co-organized the “Challenges in Machine Learning Workshop” @ NeurIPS between 2014 and 2019, launched the "NeurIPS challenge track" in 2017 while she was general chair, and pushed the creation of the "NeurIPS datasets and benchmark track" in 2021, as a NeurIPS board member.

Invited Talk: Interaction-Centric AI

Nov. 30, 2022, 7:30 a.m.

Remarkable model performance makes news headlines and compelling demos, but these advances rarely translate to a lasting impact on real-world users. A common anti-pattern is overlooking the dynamic, complex, and unexpected ways humans interact with AI, which in turn limits the adoption and usage of AI in practical contexts. To address this, I argue that human-AI interaction should be considered a first-class object in designing AI applications.

In this talk, I present a few novel interactive systems that use AI to support complex real-life tasks. I discuss tensions and solutions in designing human-AI interaction, and critically reflect on my own research to share hard-earned design lessons. Factors such as user motivation, coordination between stakeholders, social dynamics, and user’s and AI’s adaptivity to each other often play a crucial role in determining the user experience of AI, even more so than model accuracy. My call to action is that we need to establish robust building blocks for “Interaction-Centric AI”—a systematic approach to designing and engineering human-AI interaction that complements and overcomes the limitations of model- and data-centric views.


Juho Kim

Nov. 30, 2022, 12:30 p.m.

Among the great challenges posed to democracy today is the use of technology, data, and automated systems in ways that threaten the rights of the American public. Too often, these tools are used to limit our opportunities and prevent our access to critical resources or services. These problems are well documented. In America and around the world, systems supposed to help with patient care have proven unsafe, ineffective, or biased. Algorithms used in hiring and credit decisions have been found to reflect and reproduce existing unwanted inequities or embed new harmful bias and discrimination. Unchecked social media data collection has been used to threaten people’s opportunities, undermine their privacy, or pervasively track their activity—often without their knowledge or consent.

These outcomes are deeply harmful—but they are not inevitable. Automated systems have brought about extraordinary benefits, from technology that helps farmers grow food more efficiently and computers that predict storm paths, to algorithms that can identify diseases in patients. These tools now drive important decisions across sectors, while data is helping to revolutionize global industries. Fueled by the power of American innovation, these tools hold the potential to redefine every part of our society and make life better for everyone.

This important progress must not come at the price of civil rights or democratic values, foundational American principles that President Biden has affirmed as a cornerstone of his Administration. To advance President Biden’s vision, the White House Office of Science and Technology Policy has identified five principles that should guide the design, use, and deployment of automated systems to protect the American public in the age of artificial intelligence.

The Blueprint for an AI Bill of Rights is a guide for a society that protects all people from these threats—and uses technologies in ways that reinforce our highest values. Responding to the experiences of the American public, and informed by insights from researchers, technologists, advocates, journalists, and policymakers, this framework is accompanied by From Principles to Practice—a handbook for anyone seeking to incorporate these protections into policy and practice, including detailed steps toward actualizing these principles in the technological design process. These principles help provide guidance whenever automated systems can meaningfully impact the public’s rights, opportunities, or access to critical needs.


Alondra Nelson

Alondra Nelson, Ph.D., (NAM) is the Harold F. Linder Professor at the Institute for Advanced Study. She currently serves as Deputy Assistant to the President and Deputy Director for Science and Society in the White House Office of Science and Technology Policy, where she performed the duties of the Director from February to October 2022. Dr. Nelson is most widely known for her research at the intersection of science, technology, medicine, and social inequality, and as the acclaimed author of award-winning books, including The Social Life of DNA: Race, Reparations, and Reconciliation after the Genome (2016); Body and Soul: The Black Panther Party and the Fight against Medical Discrimination (2011); Genetics and the Unsettled Past: The Collision of DNA, Race, and History (2012; with Keith Wailoo and Catherine Lee); and Technicolor: Race, Technology, and Everyday Life (2001; with Thuy Linh Tu). Before joining the Biden Administration, Nelson was co-chair of the National Academy of Medicine Committee on Emerging Science, Technology, and Innovation and was a member of the National Academy of Engineering Committee on Responsible Computing Research. She served as a past president of the Social Science Research Council, an international research nonprofit, and was previously the inaugural Dean of Social Science at Columbia University. Dr. Nelson began her academic career on the faculty of Yale University, and there was recognized with the Poorvu Prize for interdisciplinary teaching excellence. Dr. Nelson is an elected member of the National Academy of Medicine, the American Academy of Arts and Sciences, the American Philosophical Society, the American Association for the Advancement of Science, and the American Academy of Political and Social Science.

Dec. 1, 2022, 7:30 a.m.

NeurIPS has been in existence for more than 3 decades, each one marked by a dominant trend. The pioneering years saw the burgeoning of back-prop nets, the coming-of-age years blossomed with convex optimization, regularization, Bayesian methods, boosting, kernel methods, to name a few, and the junior years have been dominated by deep nets and big data. And now, recent analyses conclude that using ever bigger data and deeper networks is not a sustainable way of progressing. Meanwhile, other indicators show that Machine Learning is increasingly reliant upon good data and benchmarks, not only to train more powerful and/or more compact models, but also to soundly evaluate new ideas and to stress test models on their reliability, fairness, and protection against various attacks, including privacy attacks.

Simultaneously, in 2021, the NeurIPS Dataset and Benchmark track was launched and the Data-Centric AI initiative was born. This kickstarted the "data-centric era". It is gaining momentum in response to the new needs of data scientists who, admittedly, spend more time on understanding problems, designing experimental settings, and engineering datasets, than on designing and training ML models.

We will retrace the enormous collective efforts made by our community since the 1980s to share datasets and benchmarks, putting forward important milestones that led us to today's effervescence. We will pick a few hot topics that have raised controversy and have engendered novel thought-provoking contributions. Finally, we will highlight some of the most pressing issues that must be addressed by the community.


Isabelle Guyon

Isabelle Guyon recently joined Google Brain as a research scientist. She is also professor of artificial intelligence at Université Paris-Saclay (Orsay). Her areas of expertise include computer vision, bioinformatics, and power systems. She is best known for being a co-inventor of Support Vector Machines. Her recent interests are in automated machine learning, meta-learning, and data-centric AI. She has been a strong promoter of challenges and benchmarks, and is president of ChaLearn, a non-profit dedicated to organizing machine learning challenges. She is community lead of Codalab competitions, a challenge platform used both in academia and industry. She co-organized the “Challenges in Machine Learning Workshop” @ NeurIPS between 2014 and 2019, launched the "NeurIPS challenge track" in 2017 while she was general chair, and pushed the creation of the "NeurIPS datasets and benchmark track" in 2021, as a NeurIPS board member.

Dec. 1, 2022, 12:30 p.m.

I will describe a training algorithm for deep neural networks that does not require the neurons to propagate derivatives or remember neural activities. The algorithm can learn multi-level representations of streaming sensory data on the fly without interrupting the processing of the input stream. The algorithm scales much better than reinforcement learning and would be much easier to implement in cortex than backpropagation.


Geoffrey Hinton

Dec. 2, 2022, 6:15 a.m.

Title: Differentially Private Learning with Margin Guarantees

Abstract:

Preserving privacy is a crucial objective for machine learning algorithms. But, despite the remarkable theoretical and algorithmic progress in differential privacy over the last decade or more, its application to learning still faces several obstacles.

A recent series of publications has shown that differentially private PAC learning of infinite hypothesis sets is not possible, even for common hypothesis sets such as that of linear functions. Another rich body of literature has studied differentially private empirical risk minimization in a constrained optimization setting and shown that the guarantees are necessarily dimension-dependent. In the unconstrained setting, dimension-independent bounds have been given, but they admit a dependency on the norm of a vector that can be extremely large, which makes them uninformative.

These results raise some fundamental questions about private learning with common high-dimensional problems: is differentially private learning with favorable (dimension-independent) guarantees possible for standard hypothesis sets?

This talk presents a series of new differentially private algorithms for learning linear classifiers, kernel classifiers, and neural-network classifiers with dimension-independent, confidence-margin guarantees.

Joint work with Raef Bassily and Ananda Theertha Suresh.


Mehryar Mohri

Mehryar Mohri is a Professor of Computer Science and Mathematics at the Courant Institute of Mathematical Sciences and a Research Consultant at Google. Prior to these positions, he spent about ten years at AT&T Bell Labs, later AT&T Labs-Research, where he served for several years as a Department Head and a Technology Leader.

His research interests cover a number of different areas: primarily machine learning, algorithms and theory, automata theory, speech processing, natural language processing, and also computational biology. His research in learning theory and algorithms has been used in a variety of applications. His work on automata theory and algorithms has served as the foundation for several applications in language processing, with several of his algorithms used in virtually all spoken-dialog and speech recognition systems used in the United States.

He has co-authored several software libraries widely used in research and academic labs. He is also co-author of the machine learning textbook Foundations of Machine Learning used in graduate courses on machine learning in several universities and corporate research laboratories.

Dec. 2, 2022, 6:15 a.m.


Dec. 2, 2022, 6:35 a.m.

Advances in machine learning have led to rapid and widespread deployment of learning-based inference and decision-making for safety-critical applications, such as autonomous driving and security diagnostics. Current machine learning systems, however, assume that training and test data follow the same, or similar, distributions, and do not consider active adversaries manipulating either distribution. Recent work has demonstrated that motivated adversaries can circumvent anomaly detection or other machine learning models at test time through evasion attacks, or can inject well-crafted malicious instances into training data to induce errors at inference time through poisoning attacks, especially in the distributed learning setting. In this talk, I will describe my recent research on security and privacy problems in federated learning, with a focus on potential certifiable defense approaches, differentially private federated learning, and fairness in FL. We will also discuss other defense principles towards developing practical robust learning systems with trustworthiness guarantees.


Bo Li

Dec. 2, 2022, 6:40 a.m.


Siddharth Karamcheti

Dec. 2, 2022, 6:45 a.m.


Dec. 2, 2022, 6:57 a.m.

In this talk, I will cover the recent advances in the study of asynchronous stochastic gradient descent (SGD). Previously, it was repeatedly stated in theoretical papers that the performance of Asynchronous SGD degrades dramatically when any delay is large, giving the impression that performance depends primarily on the delay. On the contrary, we prove much better guarantees for the same Asynchronous SGD algorithm regardless of the delays in the gradients, depending instead just on the number of parallel devices used to implement the algorithm. Our guarantees are strictly better than the existing analyses, and we also argue that asynchronous SGD outperforms synchronous minibatch SGD in the settings we consider. For our analysis, we introduce a novel recursion based on "virtual iterates" and delay-adaptive stepsizes, which allow us to derive state-of-the-art guarantees for both convex and non-convex objectives.
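
The following toy simulation is an illustrative sketch (not the talk's analysis or code) of the setting described above: workers return gradients computed at stale parameters, and the server applies them with a delay-adaptive stepsize. The objective and all names are assumptions chosen for the example.

```python
import numpy as np
rng = np.random.default_rng(0)

def grad(w):                                  # gradient of f(w) = 0.5 * ||w||^2
    return w

M = 8                                         # number of parallel workers
w = rng.normal(size=10)                       # parameters held by the server
copies = [(w.copy(), 0) for _ in range(M)]    # each worker's stale copy and read time
base_lr, t = 0.5, 0

for _ in range(2000):
    k = int(rng.integers(M))                  # some worker finishes its computation
    w_stale, t_read = copies[k]
    delay = t - t_read                        # staleness of that worker's gradient
    lr = base_lr / (1 + delay)                # delay-adaptive stepsize
    w = w - lr * grad(w_stale)                # server applies the (stale) gradient
    t += 1
    copies[k] = (w.copy(), t)                 # worker reads the current parameters

print("final distance to optimum:", np.linalg.norm(w))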


Dec. 2, 2022, 7 a.m.


Kelsey Allen

Dec. 2, 2022, 7 a.m.


Jens Kober

Dec. 2, 2022, 7:05 a.m.

What we pay attention to depends on the context and the task at hand. On the one hand, the prefrontal cortex can modulate how to direct attention outward to the external world. On the other hand, attention to internal states enables metacognition and configuration of internal states using repertoires of memories and skills. I will first discuss ongoing work in which, inspired by the role of attention in affordances and task-sets, we analyze large-scale gameplay data in the Xbox 3D game Bleeding Edge in an interpretable way. I will briefly mention ongoing directions including decoding of plans during chess based on eye-tracking. I will conclude with how future models of multi-scale predictive representations could include prefrontal cortical modulation during planning and task performance.


Ida Momennejad

Invited Talk: Mengye Ren

Dec. 2, 2022, 7:10 a.m.


Dec. 2, 2022, 7:10 a.m.


Dec. 2, 2022, 7:15 a.m.


Invited Talk: Chen Yan

Dec. 2, 2022, 7:20 a.m.


Chen Yan

Dec. 2, 2022, 7:25 a.m.

Many deep neural network architectures loosely based on brain networks have recently been shown to replicate neural firing patterns observed in the brain. One of the most exciting and promising novel architectures, the Transformer neural network, was developed without the brain in mind. In this work, we show that transformers, when equipped with recurrent position encodings, replicate the precisely tuned spatial representations of the hippocampal formation; most notably place and grid cells. Furthermore, we show that this result is no surprise since it is closely related to current hippocampal models from neuroscience. We additionally show the transformer version offers dramatic performance gains over the neuroscience version. This work continues to bind computations of artificial and brain networks, offers a novel understanding of the hippocampal-cortical interaction, and suggests how wider cortical areas may perform complex tasks beyond current neuroscience models such as language comprehension.


James Whittington

Dec. 2, 2022, 7:30 a.m.


Azalia Mirhoseini

Dec. 2, 2022, 7:30 a.m.


Willie Neiswanger

Dec. 2, 2022, 7:30 a.m.


Danica Kragic

Dec. 2, 2022, 7:35 a.m.


Dec. 2, 2022, 7:40 a.m.


Dec. 2, 2022, 7:45 a.m.

In robotics, human-robot collaboration works best when robots are responsive to their human partners’ mental states. Human eye gaze has been used as a proxy for one such mental state: attention. While eye gaze can be a useful signal, for example enabling intent prediction, it is also a noisy one. Gaze serves several functions beyond attention, and thus recognizing what people are attending to from their eye gaze is a complex task. In this talk, I will discuss our research on modeling eye gaze to understand human attention in collaborative tasks such as shared manipulation and assisted driving.


Henny Admoni

Invited Talk: Invited Talk 1

Dec. 2, 2022, 8 a.m.


Max A Wiesner

Dec. 2, 2022, 8 a.m.


Marta Blangiardo

Dec. 2, 2022, 8:05 a.m.

When people make sense of the world, they don't only pay attention to what's actually happening. Their mind also takes them to counterfactual worlds of what could have happened. In this talk, I will illustrate how we can use eye-tracking to uncover the human mind's forays into the imaginary. I will show that when people make causal judgments about physical interactions, they don't just look at what actually happens. They mentally simulate what would have happened in relevant counterfactual situations to assess whether the cause made a difference. And when people try to figure out what happened in the past, they mentally simulate the different scenarios that could have led to the outcome. Together these studies illustrate how attention is not only driven by what's out there in the world, but also by what's hidden inside the mind.


Tobias Gerstenberg

Dec. 2, 2022, 8:15 a.m.


Dec. 2, 2022, 8:30 a.m.


Igor Mordatch

Dec. 2, 2022, 9 a.m.

I will discuss two ideas: (1) virtual laboratories for science and R&D, aiming to introduce an interface between algorithms and domain research that enables AI-driven scale advantages, and (2) AI-based ‘sidekick’ assistants. The purpose of the assistants is to help other agents reach their goals, even when they are not yet able to specify the goal explicitly or it is evolving. Such assistants can help with prior knowledge elicitation at the simplest, and provide zero-shot assistance in the worst case. Ultimately they should be helpful for human domain experts in running experiments and solving research problems in virtual laboratories. I invite researchers to join the virtual laboratory movement: domain scientists by hosting a virtual laboratory in their field, methods researchers by contributing new methods to virtual laboratories, and human-in-the-loop ML researchers by developing the assistants.


Dec. 2, 2022, 9 a.m.


Slava Borovitskiy

Dec. 2, 2022, 9:10 a.m.


Dec. 2, 2022, 9:15 a.m.


Noah Goodman

Dec. 2, 2022, 9:20 a.m.

Compared to other machine learning tasks like classification and regression, synthetic data generation is a new area of inquiry for machine learning. One challenge we encountered early on in working with synthetic data was the lack of standardized metrics for evaluating it. Although evaluation for tabular synthetic data is less subjective than for ML-generated images or natural language, it comes with its own specific considerations. For instance, metrics must take into account what the data is being generated for, as well as tradeoffs between quality, privacy, and utility that are inherent to this type of data.

To begin addressing this need, we created an open source library called SDMetrics, which contains a number of synthetic data evaluation tools. We identified inherent hierarchies that exist in these evaluations — for example, columnwise comparison vs. correlation matrix comparison — and built ways to test and validate these metrics. The library also provides user-friendly, focused reports and mechanisms to prevent "metric fatigue."
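
To make the two levels of comparison mentioned above concrete, here is a generic sketch of a column-wise metric and a correlation-matrix metric. This is illustrative code written for this summary and is not the SDMetrics API; it assumes the real and synthetic tables are pandas DataFrames with matching numeric columns.

```python
import numpy as np
from scipy import stats

def columnwise_ks(real, synthetic):
    """Per-column Kolmogorov-Smirnov statistic (0 = identical marginals, 1 = disjoint)."""
    return {col: stats.ks_2samp(real[col], synthetic[col]).statistic
            for col in real.columns}

def correlation_matrix_distance(real, synthetic):
    """Mean absolute difference between the two Pearson correlation matrices,
    capturing pairwise structure rather than single-column fidelity."""
    diff = real.corr().to_numpy() - synthetic.corr().to_numpy()
    return float(np.abs(diff).mean())
```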


Kalyan Veeramachaneni

Dec. 2, 2022, 9:30 a.m.


Dec. 2, 2022, 9:55 a.m.


Dec. 2, 2022, 10 a.m.


Dec. 2, 2022, 11 a.m.


Francesco Di Giovanni

Francesco Di Giovanni

Dec. 2, 2022, 11:05 a.m.


Stefanie Tellex

Dec. 2, 2022, 11:30 a.m.


Matej Balog

Matej Balog

Invited Talk: Yejin Choi

Dec. 2, 2022, 11:30 a.m.


Dec. 2, 2022, 11:30 a.m.


Dan Bohus

Dec. 2, 2022, 11:30 a.m.


Jasper Snoek

Jasper Snoek is a research scientist at Google Brain. His research has touched a variety of topics at the intersection of Bayesian methods and deep learning. He completed his PhD in machine learning at the University of Toronto. He subsequently held postdoctoral fellowships at the University of Toronto, under Geoffrey Hinton and Ruslan Salakhutdinov, and at the Harvard Center for Research on Computation and Society, under Ryan Adams. Jasper co-founded a Bayesian optimization focused startup, Whetlab, which was acquired by Twitter. He has served as an Area Chair for NeurIPS, ICML, AISTATS and ICLR, and organized a variety of workshops at ICML and NeurIPS.

Dec. 2, 2022, noon


Brenna Argall

Dec. 2, 2022, noon

Unconstrained eye gaze estimation using ordinary webcams in smart phones and tablets is immensely useful for many applications. However, current eye gaze estimators are limited in their ability to generalize to a wide range of unconstrained conditions, including head poses, eye gaze angles, and lighting conditions. This is mainly due to the lack of availability of gaze training data in in-the-wild conditions. Notably, eye gaze is a natural form of human communication while humans interact with each other. Visual data (videos or images) containing human interaction are also abundantly available on the internet and are constantly growing as people upload more. Could we leverage visual data containing human interaction to learn unconstrained gaze estimators? In this talk we will describe our foray into addressing this challenging problem. Our findings point to the great potential of human interaction data as a low-cost and ubiquitously available source of training data for unconstrained gaze estimators. By lessening the burden of specialized data collection and annotation, we hope to foster greater real-world adoption and proliferation of gaze estimation technology in end-user devices.


Shalini De Mello

Shalini De Mello is a Principal Research Scientist and Research Lead in the Learning and Perception Research group at NVIDIA, which she joined in 2013. Her research interests are in human-centric vision (face and gaze analysis) and in data-efficient (synth2real, low-shot, self-supervised and multimodal) machine learning. She has co-authored 48 peer-reviewed publications and holds 38 patents. Her inventions have contributed to several NVIDIA products, including DriveIX and Maxine. Previously, she has worked at Texas Instruments and AT&T Laboratories. She received her Doctoral degree in Electrical and Computer Engineering from the University of Texas at Austin.

Dec. 2, 2022, noon


Xavier Bresson

Dec. 2, 2022, noon


Dec. 2, 2022, 12:10 p.m.

Progress, achievements and challenges in synthetic data.


Dimitris Vlitas

Dec. 2, 2022, 12:20 p.m.

Humans and many other animals have an enormous capacity to learn about sensory stimuli and to master new skills. Many of the mechanisms that enable us to learn remain to be understood. One of the greatest challenges of systems neuroscience is to explain how synaptic connections change to support maximally adaptive behaviour. We will provide an overview of factors that determine the change in the strength of synapses. Specifically, we will discuss the influence of attention, neuromodulators and feedback connections in synaptic plasticity and suggest a specific framework, called BrainProp, in which these factors interact to improve the functioning of the entire network.

Much recent work focuses on learning in the brain using presumed biologically plausible variants of supervised learning algorithms. However, the biological plausibility of these approaches is limited, because there is no teacher in the motor cortex that instructs the motor neurons. Instead, learning in the brain usually depends on reward and punishment. BrainProp is a biologically plausible reinforcement learning scheme for deep networks with any number of layers. The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in lower layers that are responsible for this action. After the choice, the network receives reinforcement so that there is no need for a teacher. We showed how BrainProp is mathematically equivalent to error backpropagation, for one output unit at a time (Pozzi et al., 2020). We illustrate learning of classical and hard image-classification benchmarks (MNIST, CIFAR10, CIFAR100 and Tiny ImageNet) by deep networks. BrainProp achieves an accuracy that is equivalent to that of standard error-backpropagation, and better than other state-of-the-art biologically inspired learning schemes. Additionally, the trial-and-error nature of learning is associated with limited additional training time, so that BrainProp is only a factor of 1-3.5 slower. These results provide new insights into how deep learning may be implemented in the brain.
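
As a rough illustration of the "select one output unit, then reinforce" idea described above, here is a heavily simplified sketch. It is not the BrainProp implementation; the network sizes and update rule are assumptions chosen to show the mechanism (credit flows through feedback weights only for the chosen output unit, driven by a reinforcement signal rather than a teacher).

```python
import numpy as np
rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer network on 784-dimensional inputs, 10 classes.
W1 = rng.normal(scale=0.1, size=(784, 100))
W2 = rng.normal(scale=0.1, size=(100, 10))
lr = 0.01

def brainprop_like_step(x, label):
    global W1, W2
    h = np.tanh(x @ W1)                       # forward pass
    logits = h @ W2
    p = np.exp(logits - logits.max()); p /= p.sum()
    a = rng.choice(10, p=p)                   # the network *chooses* one output unit
    r = 1.0 if a == label else 0.0            # reinforcement; no explicit teacher
    delta = r - p[a]                          # credit signal for the chosen unit only
    delta_h = W2[:, a] * delta * (1 - h**2)   # feedback connections assign credit below
    W2[:, a] += lr * delta * h
    W1 += lr * np.outer(x, delta_h)
    return r

# Toy usage with a random input:
print(brainprop_like_step(rng.normal(size=784), label=3))
```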


Pieter Roelfsema

Invited Talk: Greg Yang

Dec. 2, 2022, 12:30 p.m.


Dec. 2, 2022, 12:35 p.m.

Synthetic medical data – needed to bring ML in medicine up to speed?

To bring medical AI to the next level – in terms of exceeding the state of the art and not just developing but also implementing novel algorithms as decision support tools in clinical practice – we need to bring data scientists, laboratory scientists and physician researchers together. To succeed with this, we need to bring medical data out into the open, to benefit from the best possible models being developed jointly by the data science community. Solving the data privacy issues – creating synthetic data sets that reflect the correlations of the original data sets without jeopardizing data privacy – is needed. Here you get a physician researcher's perspective on real clinical challenges, real data science approaches being implemented in clinical practice, regulatory issues, and real unmet needs for synthetic data sets.


Carsten Utoft Niemann

Hematologist, chief physician and associate professor at Copenhagen University with more than 10 years of hands-on experience in hematology research laboratories across Europe and the US. Chairing the Nordic CLL Study Group and heading the CLL research laboratory and clinical research program at Rigshospitalet, Copenhagen, Denmark. Combining genetic and functional characterization of CLL and microenvironmental cells, MRD testing and diagnostic work-up for CLL based on the EuroMRD and ERIC collaborations, along with epidemiological studies. Impacting current practice in CLL, co-authoring ESMO guidelines, TP53 guidelines, and flow cytometry and MRD guidelines for CLL, in addition to 100 peer-reviewed publications. Founding member of the clinical GAIA/CLL13, VISION/HO141, HO158, HO159 and CLL17 trials, thus leading the way for testing targeted therapy in CLL. Founder of the PreVent-ACaLL phase 2-3 trial, the first machine-learning-based clinical trial in CLL. He thus applies a uniquely strong background across all the disciplines needed for translational research in CLL, feeding into extended decision matrices for lymphoid malignancies based on medical Artificial Intelligence for precision medicine.

Dec. 2, 2022, 12:40 p.m.

Attention in psychology and neuroscience conceptualizes how the human mind prioritizes information as a result of limited resources. Machine learning systems do not necessarily share the same limits, but implementations of attention have nevertheless proven useful in machine learning across a broad set of domains. Why is this so? I will focus on one aspect: interpretability, which is an ongoing challenge for machine learning systems. I will discuss two different implementations of attention in machine learning that tie closely to conceptualizations of attention in two domains of psychological research. Using these case studies as a starting point, I will discuss the broader strengths and drawbacks of using attention to constrain and interpret how machine learning systems process information. I will end with a problem statement highlighting the need to move away from localized notions to a global view of how attention-like mechanisms modulate information processing in artificial systems.


Dec. 2, 2022, 12:45 p.m.


Paula Moraga

Dec. 2, 2022, 1 p.m.

Attention and eye movements are thought to be a window to the human mind, and have been extensively studied across Neuroscience, Psychology and HCI. However, progress in this area has been severely limited as the underlying methodology relies on specialized hardware that is expensive (up to $30,000) and hard to scale. In this talk, I will present our recent work from Google, which shows that ML applied to smartphone selfie cameras can enable accurate gaze estimation, comparable to state-of-the-art hardware-based devices, at 1/100th the cost and without any additional hardware. Via extensive experiments, we show that our smartphone gaze tech can successfully replicate key findings from prior hardware-based eye movement research in Neuroscience and Psychology, across a variety of tasks including traditional oculomotor tasks, saliency analyses on natural images and reading comprehension. We also show that smartphone gaze could enable applications in improved health/wellness, for example, as a potential digital biomarker for detecting mental fatigue. These results show that smartphone-based attention has the potential to unlock advances by scaling eye movement research, and enabling new applications for improved health, wellness and accessibility, such as gaze-based interaction for patients with ALS or stroke who cannot otherwise interact with devices.


Vidhya Navalpakkam

I am currently a Principal Scientist at Google Research. I lead an interdisciplinary team at the intersection of Machine learning, Neuroscience, Cognitive Psychology and Vision. My interests are in modeling user attention and behavior across multimodal interfaces, for improved usability and accessibility of Google products. I am also interested in applications of attention for healthcare (e.g., smartphone-based screening for health conditions).

Before joining Google in 2012, I was at Yahoo Research. Prior to joining the industry in 2010, I worked on modeling attention mechanisms in the brain during my postdoc at Caltech (working with Drs. Christof Koch, Pietro Perona and Antonio Rangel) and PhD at USC (working with Dr. Laurent Itti). I have a Bachelors in Computer Science from the Indian Institute of Technology, Kharagpur.

Dec. 2, 2022, 1 p.m.


Dec. 2, 2022, 1:10 p.m.


Dec. 2, 2022, 1:30 p.m.

Title: The Fifth Paradigm of Scientific Discovery

Abstract: I will argue that we may be at the beginning of a new paradigm of scientific discovery based on deep learning combined with ab initio simulation of physical processes. We envision a system where simulations generate data to train neural surrogate models that in turn will accelerate simulations. The result will be an active learning framework where accurate data is acquired when the surrogate model is uncertain about its predictions. We will argue this hybrid approach can accelerate scientific discovery, for instance the search for new drugs and materials.


Max Welling

Dec. 2, 2022, 1:30 p.m.

Carlos Quintero Pena


Dec. 2, 2022, 1:30 p.m.

Existing theory predicts that data heterogeneity will degrade the performance of the Federated Averaging (FedAvg) algorithm. However, in practice, the simple FedAvg algorithm converges very well. In this talk, we explain the seemingly unreasonable effectiveness of FedAvg that contradicts the previous theoretical predictions. We find that the key assumption of bounded gradient dissimilarity in previous theoretical analyses is too pessimistic to characterize data heterogeneity in practical applications. For a simple quadratic problem, we demonstrate there exist regimes where large gradient dissimilarity does not have any negative impact on the convergence of FedAvg. Motivated by this observation, we propose a new quantity, the average drift at optimum, to measure the effects of data heterogeneity and explicitly use it to present a new theoretical analysis of FedAvg. We show that the average drift at optimum is nearly zero across many real-world federated training tasks, whereas the gradient dissimilarity can be large. Our new analysis suggests FedAvg can have identical convergence rates in homogeneous and heterogeneous data settings, and hence leads to a better understanding of its empirical success.
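
The following is a minimal sketch of the quadratic setting described above, contrasting a drift-at-optimum style measurement with gradient dissimilarity. It is illustrative code written for this summary (the client data, step sizes, and exact definitions are assumptions), not the paper's code.

```python
import numpy as np
rng = np.random.default_rng(0)

# Each client i holds heterogeneous data and minimizes f_i(w) = 0.5*||A_i w - b_i||^2.
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(10)]

def grad_i(w, A_i, b_i):
    return A_i.T @ (A_i @ w - b_i)

# Global optimum of the average objective (closed form for quadratics).
H = sum(A_i.T @ A_i for A_i, _ in clients)
g = sum(A_i.T @ b_i for A_i, b_i in clients)
w_star = np.linalg.solve(H, g)

def local_sgd(w, A_i, b_i, lr=0.01, steps=10):
    for _ in range(steps):
        w = w - lr * grad_i(w, A_i, b_i)
    return w

# Drift at optimum: start every client at w_star, run local steps, and measure
# how far the *average* of the local models moves away from w_star.
local_models = [local_sgd(w_star.copy(), A_i, b_i) for A_i, b_i in clients]
drift = np.linalg.norm(np.mean(local_models, axis=0) - w_star)

# Gradient dissimilarity at the optimum, for comparison.
dissimilarity = np.mean([np.linalg.norm(grad_i(w_star, A_i, b_i))**2 for A_i, b_i in clients])
print(f"average drift at optimum: {drift:.4f}   gradient dissimilarity: {dissimilarity:.2f}")
```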


Jianyu Wang

Dec. 2, 2022, 1:35 p.m.


Igor Mordatch

Dec. 2, 2022, 1:45 p.m.


Dec. 2, 2022, 1:52 p.m.

Vertical Federated Learning (VFL) algorithms are an important class of federated learning algorithms in which parties’ local datasets share a common sample ID space but have different feature sets. This is in contrast to Horizontal Federated Learning (HFL), where parties share the same feature sets but for different sample IDs. While much work has been done to advance the efficiency and flexibility of HFL, these techniques do not directly extend to VFL due to differences in the model architecture and training paradigm. In this talk, I will present two methods for efficient and robust VFL. The first, Compressed VFL, reduces communication cost through message compression while achieving the same asymptotic convergence rate as standard VFL with no compression. The second, Flex-VFL, extends VFL to support heterogeneous parties that may use different local optimizers and may operate at different rates. I will highlight some interesting theoretical and experimental results for each method, and finally, I will present some directions and open questions for future work in VFL.
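
To make the vertical setup concrete, here is a minimal sketch of a linear model trained across parties that share sample IDs but hold disjoint feature blocks, with a toy top-k compression of the messages each party sends. This is an illustrative sketch under those assumptions, not the Compressed VFL or Flex-VFL algorithms.

```python
import numpy as np
rng = np.random.default_rng(0)

n, d = 200, 12
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
blocks = np.array_split(np.arange(d), 3)           # 3 parties with disjoint feature blocks
W = [rng.normal(scale=0.1, size=len(b)) for b in blocks]

def top_k(v, k):
    """Toy message compression: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

lr = 0.01
for _ in range(300):
    messages = [top_k(X[:, b] @ w, k=128) for b, w in zip(blocks, W)]  # compressed partial predictions
    residual = sum(messages) - y                                       # server combines and broadcasts the error
    for i, b in enumerate(blocks):
        W[i] -= lr * X[:, b].T @ residual / n                          # each party updates only its own block
```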


Stacy Patterson

Dec. 2, 2022, 1:55 p.m.

Progress, achievements, and challenges in synthetic data.


Tucker Balch

Zhaozhi Qian

Dec. 2, 2022, 2:05 p.m.


Invited Talk: He He

Dec. 2, 2022, 2:15 p.m.


Dec. 2, 2022, 2:15 p.m.


James McClelland

Dec. 2, 2022, 2:20 p.m.

Privacy-Preserving Data Synthesis for General Purposes

The recent success of deep neural networks (DNNs) hinges on the availability of large-scale datasets; however, training on such datasets often poses privacy risks for sensitive training information, such as face images and medical records of individuals. In this talk, I will mainly discuss how to explore the power of generative models and gradient sparsity, and talk about different scalable privacy-preserving generative models in both centralized and decentralized settings. In particular, I will introduce our recent work on large-scale privacy-preserving data generative models leveraging gradient compression with convergence guarantees. I will also introduce how to train generative models with privacy guarantees in heterogeneous environments, where data of local agents come from diverse distributions. We will finally discuss some potential applications for different privacy-preserving data synthesis strategies.
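
As a generic illustration of the two ingredients mentioned above (privacy via per-sample gradient clipping and noise, and communication savings via gradient sparsity), here is a DP-SGD-style sanitization sketch. It is not the paper's algorithm; all parameter names and defaults are assumptions, and the privacy accounting needed for real guarantees is omitted.

```python
import numpy as np
rng = np.random.default_rng(0)

def sanitize_update(per_sample_grads, clip_norm=1.0, noise_mult=1.1, k=1000):
    """per_sample_grads: array of shape (batch_size, num_params)."""
    # 1. Clip each example's gradient to bound any single record's influence.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    clipped = per_sample_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    # 2. Sum and add Gaussian noise calibrated to the clipping norm.
    noisy = clipped.sum(axis=0) + rng.normal(scale=noise_mult * clip_norm,
                                             size=clipped.shape[1])
    # 3. Sparsify the noisy update before communicating it (compression).
    sparse = np.zeros_like(noisy)
    idx = np.argsort(np.abs(noisy))[-k:]
    sparse[idx] = noisy[idx]
    return sparse / len(per_sample_grads)
```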


Bo Li

Dec. 2, 2022, 2:30 p.m.


Dec. 2, 2022, 2:45 p.m.


Dec. 3, 2022, 6:30 a.m.


Dec. 3, 2022, 6:30 a.m.

The claim that DNNs and brains represent information in similar ways is largely based on the good performance of DNNs on various brain benchmarks. On this approach, the better DNNs can predict neural activity, the better the correspondence between DNNs and brains. But this is at odds with the standard scientific research approach that is characterized by varying independent variables to test specific hypotheses regarding the causal mechanisms that underlie some phenomenon; models are supported to the extent that they account for these experimental results. The best evidence for a model is that it survives “severe” tests, namely, experiments that have a high probability of falsifying a model if and only if the model is false in some relevant manner. When DNNs are assessed in this way, they catastrophically fail. The field needs to change its methods and put far more weight into falsification to get a better characterization of DNN-brain correspondences and to build more human-like AI.


Jeffrey Bowers

Dec. 3, 2022, 6:40 a.m.

Biological systems must selectively encode partial information about the environment, as dictated by the capacity constraints at work in all living organisms. For example, we cannot see every feature of the light field that reaches our eyes; temporal resolution is limited by transmission noise and delays, and spatial resolution is limited by the finite number of photoreceptors and output cells in the retina. Classical efficient coding theory describes how sensory systems can maximize information transmission given such capacity constraints, but it treats all input features equally. Not all inputs are, however, of equal value to the organism. Our work quantifies whether and how the brain selectively encodes stimulus features, specifically predictive features, that are most useful for fast and effective movements. We have shown that efficient predictive computation starts at the earliest stages of the visual system, in the retina. We borrow techniques from statistical physics and information theory to assess how we get terrific, predictive vision from these imperfect (lagged and noisy) component parts. In broader terms, we aim to build a more complete theory of efficient encoding in the brain, and along the way have found some intriguing connections between formal notions of coarse graining in biology and physics.


Stephanie Palmer

Stephanie Palmer is an Associate Professor in the Department of Organismal Biology and Anatomy and in the Department of Physics at the University of Chicago. She has a PhD in theoretical physics from Oxford University where she was a Rhodes Scholar, and works on questions at the interface of neuroscience and statistical physics. Her recent work explores the question of how the visual system processes incoming information to make fast and accurate predictions about the future positions of moving objects in the environment. She was named an Alfred P. Sloan Foundation Fellow and holds a CAREER award from the NSF. Starting during her undergraduate years at Michigan State University, Stephanie has been teaching chemistry, physics, math, and biology to a wide range of students. At the University of Chicago, she founded and runs the Brains! Program, which brings local middle school kids from the South Side of Chicago to her lab to learn hands-on neuroscience.

Dec. 3, 2022, 7 a.m.

In science, theories are essential for encapsulating knowledge obtained from data, making predictions, and building models that make simulations and technological applications possible. Neuroscience -- along with cognitive science -- however, is a young field with fewer established theories (than, say, physics). One consequence of this fact is that new practitioners in the field sometimes find it difficult to know what makes a good theory. Moreover, the use of conceptual theories and models in the field has endured some criticisms: theories have low quantitative prediction power; models have weak transparency; etc. Addressing these issues calls for identifying the elements of theory in neuroscience. In this talk I will try to present and discuss, with case studies, the following: (1) taxonomies by which the different dimensions of a theory can be assessed; (2) criteria for the goodness of a theory; (3) trade-offs between agreement with the natural world and representational consistency in the theory/model world.


Lawrence Udeigwe

Dr. Lawrence Udeigwe is an Associate Professor of Mathematics at Manhattan College and a 2021/22 MLK Visiting Associate Professor in Brain and Cognitive Sciences at MIT. His research interests include: use of differential equations to understand the dynamical interaction between Hebbian plasticity and homeostatic plasticity; use of artificial neural networks (ANN) to investigate the mechanisms behind surround suppression and other vision normalization processes; and exploring the practical and philosophical implications of the use of theory in neuroscience. Dr. Udeigwe obtained his PhD from the University of Pittsburgh in 2014 under the supervision of Bard Ermentrout and Paul Munro.

Dec. 3, 2022, 7 a.m.


Irina Higgins

Dec. 3, 2022, 7 a.m.


Cezary Kaliszyk

Dec. 3, 2022, 7:05 a.m.

A desirable feature of interactive NLP systems is the ability to receive feedback from humans and personalize to new users. Existing paradigms encounter challenges in acquiring new concepts due to the use of discrete labels and scalar rewards. As one solution to alleviate this problem, I will present our work on Semantic Supervision (SemSUP), which trains models to predict over multiple natural language descriptions of classes (or even structured ones like JSON). SemSUP can seamlessly replace any standard supervised learning setup without sacrificing any in-distribution accuracy, while providing generalization to unseen concepts and scalability to large label spaces.
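
The following sketch illustrates the core idea described above: score an input against natural language descriptions of each class, so that new classes can be added by supplying new descriptions. It is an illustrative sketch, not the SemSup codebase; `encode_input` and `encode_text` are hypothetical stand-ins for any pair of input and text encoders.

```python
import numpy as np

def semantic_scores(x, class_descriptions, encode_input, encode_text):
    """Score an input against every class by comparing its embedding with the
    embeddings of natural language class descriptions."""
    z = encode_input(x)                                    # e.g. a document embedding
    D = np.stack([encode_text(d) for d in class_descriptions])
    return D @ z                                           # one score per description

# Unseen classes are handled by simply supplying new descriptions, e.g.:
# semantic_scores(x, ["a review about cameras", "a review about laptops"],
#                 encode_input, encode_text)
```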


Dec. 3, 2022, 7:20 a.m.

We contend with conflicting objectives when interacting with our environment, e.g., exploratory drives when the environment is unknown, or exploitative drives to maximise some expected return. A widely studied proposition for understanding how to appropriately balance between these distinct imperatives is active inference. In this talk, I will introduce active inference – a neuroscience theory – which brings together perception and action under a single objective of minimising surprisal across time. Through T-maze simulations, I will illustrate how this single objective provides a way to balance information-based exploration and exploitation. Next, I will present our work on scaling up active inference to operate in complex, continuous state-spaces. For this, we propose using multiple forms of Monte-Carlo (MC) sampling to render (expected) surprisal computationally tractable. I will construct-validate this in a complex Animal-AI environment, where our agents can simulate the future, to evince reward-directed navigation – despite a temporary suspension of visual input. Lastly, I will extend this formulation to appropriately deal with volatile environments by introducing a preference-augmented (expected) surprisal objective. Using the FrozenLake environment, I will discuss different ways of encoding preferences and how they underwrite appropriate levels of arbitration between exploitation and exploration.


Noor Sajid

Dec. 3, 2022, 7:30 a.m.


Taco Cohen

Taco Cohen is a machine learning research scientist at Qualcomm AI Research in Amsterdam and a PhD student at the University of Amsterdam, supervised by prof. Max Welling. He was a co-founder of Scyfer, a company focussed on active deep learning, acquired by Qualcomm in 2017. He holds a BSc in theoretical computer science from Utrecht University and a MSc in artificial intelligence from the University of Amsterdam (both cum laude). His research is focussed on understanding and improving deep representation learning, in particular learning of equivariant and disentangled representations, data-efficient deep learning, learning on non-Euclidean domains, and applications of group representation theory and non-commutative harmonic analysis, as well as deep learning based source compression. He has done internships at Google Deepmind (working with Geoff Hinton) and OpenAI. He received the 2014 University of Amsterdam thesis prize, a Google PhD Fellowship, ICLR 2018 best paper award for “Spherical CNNs”, and was named one of 35 innovators under 35 in Europe by MIT in 2018.

Dec. 3, 2022, 7:30 a.m.


Behnam Neyshabur

Dec. 3, 2022, 7:30 a.m.


Dec. 3, 2022, 7:35 a.m.


Markus Reichstein

Invited Talk: John Langford

Dec. 3, 2022, 7:35 a.m.


John Langford

Dec. 3, 2022, 8 a.m.


Dec. 3, 2022, 8 a.m.

In 2021, we commissioned forecasters to predict progress on ML benchmarks, including the MATH dataset for mathematical problem-solving. Progress on MATH ended up being much faster than predicted. I'll discuss what we should and shouldn't take away from this, my own predictions for future progress, and general implications for predicting future developments in ML.


Jacob Steinhardt

Dec. 3, 2022, 8:30 a.m.


Pradeep Ravikumar

Dec. 3, 2022, 9:05 a.m.


Dec. 3, 2022, 9:10 a.m.

The publication of Shannon’s ‘A Mathematical Theory of Communication’ (1948) has been described as a “delayed-action bomb”. It reshaped psychology and neuroscience and has been credited as foundational to the field of cognitive science. Yet after the initial shockwave, the pace of new ideas in cognitive science emerging from the theory slowed dramatically. This trend has begun to reverse, as evidenced by this workshop. But what accounts for the stagnation, and what accounts for the recent change? I argue that information is not enough. Information is a resource or a constraint, but it is not sufficient as a computational theory of intelligence. An important step forward for cognitive science came from the combination of information theory with expected utility theory (rate-distortion theory). More recent progress has been driven by the advent of principled approximation methods in computation. The combination of all of these ideas yields ‘information-theoretic computational rationality’, a powerful framework for understanding natural intelligence.
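
One standard way to make the combination of information theory and expected utility concrete is a rate-distortion-style objective, in which an agent chooses a policy that maximizes expected utility subject to a capacity constraint on the information it processes. This formulation is included here only as an illustration of the framework named above, not as the speaker's specific model:

```latex
\max_{\pi(a \mid s)} \ \mathbb{E}_{s \sim p(s),\; a \sim \pi(\cdot \mid s)}\big[\, U(s, a) \,\big]
\quad \text{subject to} \quad I(S; A) \le C
```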


Invited Talk: Qian Yang

Dec. 3, 2022, 9:35 a.m.


Qian Yang

Dec. 3, 2022, 11 a.m.

Adversarial examples have been recognized as a threat and still pose problems, as they are hard to defend against. Naturally, one might be tempted to think that an image looking like a panda and being classified as a gibbon would be unusual, or at least unusual enough to be discovered by, for example, Bayesian uncertainty measures. Alas, it turns out that Bayesian confidence and uncertainty measures are also easy to fool when the optimization procedure is adapted accordingly. Moreover, adversarial examples transfer between different methods, so these measures can also be attacked in a black-box setting. To conclude the talk, we will briefly discuss the practical necessity of defending against evasion, and what is needed not only to evaluate defenses properly, but also to build practical defenses.


Kathrin Grosse

Dec. 3, 2022, 11:30 a.m.

Since Shannon originally proposed his mathematical theory of communication in the middle of the 20th century, information theory has been an important way of viewing and investigating problems at the interfaces between linguistics, cognitive science, and computation. With the upsurge in applying machine learning approaches to linguistic questions, information-theoretic methods are becoming an ever more important tool in the linguist’s toolbox. This talk focuses on three concrete applications of information-theoretic techniques to the study of the lexicon. In the first part of the talk, I take a coding-theoretic view of the lexicon. Using a novel generative statistical model, I discuss how to estimate the compressibility of the lexicon under various linguistic constraints. In the second part of the talk, I will discuss a longstanding debate in semiotics: How arbitrary is the relationship between a word's form and its meaning? Using mutual information, I give the first holistic quantification of form-meaning arbitrariness, and, in a 106-language study, we do indeed find a statistically significant relationship between a word's form and its meaning in many languages. Finally, in the third part of the talk, I will focus on whether there exists a pressure for or against homophony in the lexicons of the world. On one hand, Piantadosi et al. (2012) argue that homophony enables the reuse of efficient word forms and is thus beneficial for languages. On the other hand, Trott and Bergen (2020) posit that good word forms are more often homophonous simply because they are more phonotactically probable. I will discuss a new information-theoretic quantification of a language’s homophony: the sample Rényi entropy. Then, I discuss how to use this quantification to study homophony and argue that there is no evidence for a pressure either towards or against homophony, a much more nuanced result than either Piantadosi et al.’s or Trott and Bergen’s findings.
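For readers unfamiliar with the two quantities the abstract leans on, the standard definitions (notation mine) are the mutual information between a word's form W and its meaning M, and the Rényi entropy of order \alpha, whose sample estimate is the proposed measure of homophony:

\[
I(W; M) \;=\; H(W) - H(W \mid M),
\qquad
H_\alpha(X) \;=\; \frac{1}{1-\alpha} \log \sum_{x} p(x)^{\alpha}, \quad \alpha > 0,\ \alpha \neq 1.
\]

A nonzero I(W; M) indicates that form and meaning are not fully arbitrary, while a lower Rényi entropy (for \alpha > 1) indicates a lexicon whose probability mass is concentrated on fewer word forms, i.e. more reuse of forms.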


Ryan Cotterell

Dec. 3, 2022, 11:30 a.m.

We will re-examine two popular use cases of Bayesian approaches: model selection and robustness to distribution shifts.

The marginal likelihood (Bayesian evidence) provides a distinctive approach to resolving foundational scientific questions: "how can we choose between models that are entirely consistent with any data?" and "how can we learn hyperparameters or correct ground-truth constraints, such as intrinsic dimensionalities or symmetries, if our training loss doesn't select for them?" There are compelling arguments that the marginal likelihood automatically encodes Occam's razor. There are also widespread practical applications, including the variational ELBO for hyperparameter learning. However, we will discuss how the marginal likelihood is answering a fundamentally different question than "will my trained model provide good generalization?". We consider the discrepancies and their significant practical implications in detail, as well as possible resolutions.
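For context, the two standard identities behind this paragraph (not specific to the talk) are the marginal likelihood, which integrates the likelihood over the prior, and the variational ELBO, which lower-bounds its logarithm and is what makes hyperparameter learning tractable:

\[
p(\mathcal{D} \mid \mathcal{M}) \;=\; \int p(\mathcal{D} \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})\, d\theta,
\qquad
\log p(\mathcal{D} \mid \mathcal{M}) \;\ge\; \mathbb{E}_{q(\theta)}\!\left[\log p(\mathcal{D} \mid \theta, \mathcal{M})\right] - D_{\mathrm{KL}}\!\left(q(\theta)\,\|\,p(\theta \mid \mathcal{M})\right).
\]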

Moreover, it is often thought that Bayesian methods, representing epistemic uncertainty, ought to have more reasonable predictive distributions under covariate shift, since these points will be far from our data manifold. However, we were surprised to find that high quality approximate Bayesian inference often leads to significantly decreased generalization performance. To understand these findings, we investigate fundamentally why Bayesian model averaging can deteriorate predictive performance under distribution and covariate shifts, and provide several remedies based on this understanding.


Andrew Gordon Wilson

Dec. 3, 2022, 11:30 a.m.


Kristopher Jensen

Dec. 3, 2022, noon

This talk is concerned with causal representation learning, which aims to reveal the underlying high-level hidden causal variables and their relations. It can be seen as a special case of causal discovery, whose goal is to recover the underlying causal structure or causal model from observational data. The modularity property of a causal system implies properties of minimal changes and independent changes of causal representations, and I will explain how such properties make it possible to recover the underlying causal representations from observational data with identifiability guarantees: under appropriate assumptions, the learned representations are consistent with the underlying causal process. The talk will consider various settings with independent and identically distributed (i.i.d.) data, temporal data, or data with distribution shift as input, and demonstrate when identifiable causal representation learning can benefit from the flexibility of deep learning and when it has to impose parametric assumptions on the causal process.


Kun Zhang

Dec. 3, 2022, noon

Mathematics requires systematic reasoning, namely the step-wise application of knowledge in a sound manner to reach a conclusion. Can language models (LMs) perform this kind of systematic reasoning with knowledge provided to them? Or, even more ambitiously, can LMs reason systematically with their own internal knowledge acquired during pretraining? In this talk, I'll attempt to answer these questions, illustrated with our recent work on using LMs for logical deduction, proof generation, and multistep textual entailment problems. While progress has been made, there is still a way to go. To illustrate this, I'll conclude by posing a currently unsolved grand challenge to the math reasoning community, namely answering Fermi problems, which requires combining systematic reasoning, mathematics, and world knowledge.
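To make the grand challenge concrete, here is the classic back-of-the-envelope Fermi estimate of the number of piano tuners in Chicago; every number below is a rough assumption of mine, but it illustrates the mix of systematic reasoning, arithmetic, and world knowledge that answering Fermi problems requires:

    # Classic Fermi estimate: how many piano tuners work in Chicago?
    population        = 3_000_000   # people in Chicago (rough)
    people_per_house  = 2           # average household size
    piano_fraction    = 1 / 20      # fraction of households owning a piano
    tunings_per_year  = 1           # tunings each piano gets per year
    tunings_per_day   = 4           # pianos one tuner can service per day
    working_days      = 250         # working days per year

    pianos         = population / people_per_house * piano_fraction
    tunings_needed = pianos * tunings_per_year
    tuner_capacity = tunings_per_day * working_days
    print(round(tunings_needed / tuner_capacity))  # ~75 tuners

Each intermediate quantity is a separate piece of world knowledge, and the final answer is only expected to be correct to within an order of magnitude.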


Dec. 3, 2022, 12:10 p.m.

In the economics literature, rate-distortion theory (under the name “rational inattention”) has been popular as a model of choice that depends only imprecisely on the characteristics of the options available to an individual decision maker (Sims, 2003; Woodford, 2009; Matejka and McKay, 2015; Mackowiak et al., forthcoming). In this theory, the distribution of actions taken in a given objective situation is assumed to be optimal (in the sense of maximizing expected reward), subject to a constraint on the mutual information between the objective state and the action choice. However, the assumption that a mutual-information cost is the only limit on the precision of choice has unappealing implications: for example, that conditional action probabilities should vary discontinuously with the (continuous) objective state if the rewards associated with given actions are a discontinuous function of the state. In the case of strategic interaction between multiple information-constrained decision makers, this can result in a prediction that equilibrium behavior (in which each agent’s behavior is optimally adapted to the others’ patterns of behavior) should vary discontinuously with changes in the objective state, with the discontinuous responses of each agent being justified by the discontinuous responses of the others. In the kind of example discussed, the location of the discontinuity is indeterminate, so that the assumption of mutually well-adapted behavior fails to yield definite predictions (Yang, 2015); moreover, the predicted discontinuity of equilibrium behavior does not seem to be observed in experiments (Heinemann et al., 2004, 2009; Frydman and Nunnari, 2022). We propose an alternative model of imprecise choice, in which each decision maker is modeled using a generalization of the “β-variational autoencoder” of Alemi et al. (2018), which nests the “rationally inattentive” model of choice as a limiting case. In our more general model, there are two distinct “rate-distortion” trade-offs: one between the rate of information transmission and a cross-entropy measure of distortion (as in the β-VAE of Alemi et al.), and another between the rate and the measure of distortion given by the negative of expected reward (as in rational inattention models). The generalization provides a model of how an imprecise classification of decision situations can be learned from a finite training data set, rather than assuming optimization relative to a precisely correct prior distribution; and it predicts only gradual changes in action probabilities in response to changes in the objective state, in line with experimental data.
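In symbols (my notation; a sketch of the two trade-offs the abstract contrasts rather than the exact model in the talk), the rationally inattentive decision maker trades the information rate against expected reward, while the β-VAE of Alemi et al. trades the rate against a cross-entropy reconstruction term:

\[
\text{rational inattention:}\quad \max_{p(a \mid s)} \; \mathbb{E}\left[ U(s, a) \right] - \lambda\, I(S; A),
\qquad
\beta\text{-VAE:}\quad \max_{q(z \mid x)} \; \mathbb{E}_{q(z \mid x)}\!\left[\log p(x \mid z)\right] - \beta\, D_{\mathrm{KL}}\!\left(q(z \mid x)\,\|\,p(z)\right).
\]

The generalization described above retains both kinds of distortion term, so the rationally inattentive model is recovered as a limiting case.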


Michael Woodford

Dec. 3, 2022, 12:30 p.m.


Federico Felici

Federico Felici is a Research Scientist at the Swiss Plasma Center (SPC) at EPFL, Lausanne. He holds an MSc degree in Systems & Control from Delft University of Technology (2005) and a PhD in Plasma Physics from the Swiss Plasma Center at EPFL, Switzerland (2011). He currently leads the research activities at SPC-EPFL in the area of advanced plasma control. His current research interests include all aspects of tokamak plasma control and control-oriented plasma modelling, with a strong focus on model-based approaches for practical implementation of control across various current and future fusion research devices.

Dec. 3, 2022, 12:35 p.m.

In classic instruction following, language like "I'd like the JetBlue flight" maps to actions (e.g., selecting that flight). However, language also conveys information about a user's underlying reward function (e.g., a general preference for JetBlue), which can allow a model to carry out desirable actions in new contexts. In this talk, I'll share a model that infers rewards from language pragmatically: reasoning about how speakers choose utterances not only to elicit desired actions, but also to reveal information about their preferences.
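One generic way to write this kind of pragmatic inference (my notation; not necessarily the speakers' exact model) is as Bayesian inversion of a speaker model that scores utterances by how well they serve the speaker's underlying reward function:

\[
p(\theta \mid u, c) \;\propto\; p_{\text{speaker}}(u \mid \theta, c)\; p(\theta),
\]

where \theta is the latent reward function, u the utterance (e.g. "I'd like the JetBlue flight"), and c the current context; because the speaker is modeled as choosing utterances that both elicit desired actions and reveal their preferences, the inferred \theta can then guide desirable actions in new contexts.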


Anca Dragan

Dec. 3, 2022, 1 p.m.


Francois Charton

Dec. 3, 2022, 1 p.m.


Dec. 3, 2022, 1:30 p.m.


Noah Goodman

Invited talk: Hourglass Emergence

Dec. 3, 2022, 1:30 p.m.

I will discuss how error and subjectivity, and the universal collective property of biological systems, shape computation and micro-macro relationships in information processing systems. I will introduce three principles of collective computation: downward causation through coarse-graining, hourglass emergence, and a preliminary, information-theoretic concept called channel switching that my collaborators and I are developing to formalize the transition from micro-macro causality to macro-macro causality.


Jessica Flack

Jessica Flack is a professor at the Santa Fe Institute, Director of SFI's Collective Computation Group and a Chief Editor at Collective Intelligence. Flack considers herself to be a computational Platonist who is interested in the foundations of computation in nature, the origins of biological space and timescales, and micro-macro maps in information processing systems. Flack works across levels of biological organization from cells to populations of neurons to animal societies to markets. Goals of Flack’s research are to discover the computational principles that allow nature to overcome subjectivity due to noisy information processing, reduce uncertainty, and compute robust, ordered states.

Dec. 3, 2022, 1:30 p.m.


Catherine Nakalembe

Hannah Kerner

Dec. 3, 2022, 1:30 p.m.

In this talk, we discuss two failure cases of common practices that are typically believed to improve on vanilla methods: (i) adversarial training can lead to worse robust accuracy than standard training; (ii) active learning can lead to a worse classifier than a model trained using uniform samples. In particular, we can prove, both mathematically and empirically, that such failures can happen in the small-sample regime. We discuss high-level explanations derived from the theory that shed light on the causes of these phenomena in practice.


Fanny Yang

Dec. 3, 2022, 1:35 p.m.

There has been an increased interest in developing general-purpose pretrained models across different domains, such as language, vision, and multimodal settings. This approach is appealing because we can pretrain models on large datasets once and then adapt them to various tasks using a smaller supervised dataset. Moreover, these models achieve impressive results on a range of benchmarks, often performing better than task-specific models. Finally, this pretraining approach processes the data passively and does not rely on actively interacting with humans. In this talk, I will first discuss what aspects of language children can learn passively and to what extent interacting with others might require developing theory of mind. Next, I discuss the need for better evaluation pipelines to understand the shortcomings and strengths of pretrained models. In particular, I will talk about: (1) the necessity of directly measuring real-world performance (as opposed to relying on benchmark performance), (2) the importance of strong baselines, and (3) how to design probing datasets to measure certain capabilities of our models. I will focus on commonsense reasoning, verb understanding, and theory of mind as challenging domains for our existing pretrained models.


Dec. 3, 2022, 1:45 p.m.


Erin Hartman

Dec. 3, 2022, 2:10 p.m.


Alicia Beckford Wassink

Alicia Beckford Wassink is a professor in the Department of Linguistics, University of Washington, and directs the Sociolinguistics Laboratory. She currently serves on the executive committees of both the Linguistic Society of America and the American Dialect Society. Wassink's research interests lie in the sociophonetic analysis of the production and perception of time-varying features of speech (vowel systems in particular), the linguistic outcomes of interethnic contact, racial bias in automatic speech recognition, and social network modeling. Her work appears in the Wiley Encyclopedia of World Englishes, as well as books on Language and Identity (Edinburgh University Press), African-American Women’s Language (Oxford), Best Practices in Sociophonetics (Routledge), and Language in the Schools (Elsevier). Primary reports of her research have appeared in Speech Communication, the Publications of the American Dialect Society, American Speech, Journal of The Acoustical Society of America, Journal of Phonetics, Language in Society, Language Variation and Change, Journal of English Linguistics, and the International Journal of Speech-Language Pathology.

Invited talk: Invited talk

Dec. 6, 2022, 3:30 a.m.


Andrew Lan

Dec. 6, 2022, 1:30 p.m.


Dec. 6, 2022, 1:35 p.m.


Dec. 6, 2022, 1:40 p.m.


Dec. 6, 2022, 1:45 p.m.


Dec. 6, 2022, 1:50 p.m.


Dec. 6, 2022, 1:55 p.m.


Dec. 9, 2022, 1:05 a.m.


Morine Amutorine

Dec. 9, 2022, 1:40 a.m.


Graham Cormode

Dec. 9, 2022, 5:50 a.m.


Ivana Dusparic

Invited talk: Roy Perlis

Dec. 9, 2022, 6:45 a.m.


Roy Perlis

Dec. 9, 2022, 6:55 a.m.


Alejandro Saucedo

Alejandro is the Chief Scientist at the Institute for Ethical AI & Machine Learning, where he contributes to policy and industry standards on the responsible design, development and operation of AI, including the fields of explainability, GPU acceleration, privacy-preserving ML and other key machine learning research areas. Alejandro Saucedo is also the Director of Machine Learning Engineering at Seldon Technologies, where he leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products. With over 10 years of software development experience, Alejandro has held technical leadership positions across hyper-growth scale-ups and has a strong track record of building cross-functional teams of software engineers. He is currently appointed as a governing council Member-at-Large at the Association for Computing Machinery and is the Chairperson of the Kompute GPU Acceleration Committee at the Linux Foundation.

LinkedIn: https://linkedin.com/in/axsaucedo Twitter: https://twitter.com/axsaucedo Github: https://github.com/axsaucedo Website: https://ethical.institute/

Dec. 9, 2022, 7 a.m.


Aleksander Madry

Aleksander Madry is the NBX Associate Professor of Computer Science in the MIT EECS Department and a principal investigator in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). He received his PhD from MIT in 2011 and, prior to joining the MIT faculty, he spent some time at Microsoft Research New England and on the faculty of EPFL. Aleksander's research interests span algorithms, continuous optimization, the science of deep learning, and understanding machine learning from a robustness perspective. His work has been recognized with a number of awards, including an NSF CAREER Award, an Alfred P. Sloan Research Fellowship, an ACM Doctoral Dissertation Award Honorable Mention, and the 2018 Presburger Award.

Dec. 9, 2022, 8 a.m.


Invited Talk: Tobias Gerstenberg

Dec. 9, 2022, 8:30 a.m.


Tobias Gerstenberg

Dec. 9, 2022, 9:30 a.m.


Nika Haghtalab

Dec. 9, 2022, 9:30 a.m.


Yu-Xiang Wang

Dec. 9, 2022, 10 a.m.


Invited Talk: Jakob Foerster

Dec. 9, 2022, 10 a.m.


Jakob Foerster

Jakob Foerster received a CIFAR AI chair in 2019 and is starting as an Assistant Professor at the University of Toronto and the Vector Institute in the academic year 20/21. During his PhD at the University of Oxford, he helped bring deep multi-agent reinforcement learning to the forefront of AI research and interned at Google Brain, OpenAI, and DeepMind. He has since been working as a research scientist at Facebook AI Research in California, where he will continue advancing the field up to his move to Toronto. He was the lead organizer of the first Emergent Communication (EmeCom) workshop at NeurIPS in 2017, which he has helped organize ever since.

Invited Talk: Been Kim

Dec. 9, 2022, 10:30 a.m.


Invited talk: Rosalind Picard

Dec. 9, 2022, 10:30 a.m.


Invited Talk: Yi Ma

Dec. 9, 2022, noon


Dec. 9, 2022, 12:30 p.m.


Dec. 9, 2022, 1 p.m.


Invited talk: Christopher Burr

Dec. 9, 2022, 1:30 p.m.


Christopher Burr

I am an Ethics Fellow of the Alan Turing Institute's Public Policy Programme.

My research expertise includes trustworthy AI systems, digital mental healthcare, responsible research and innovation, data ethics, and philosophy of cognitive science 🧠. My research has been featured in the Conversation, the Guardian, BBC Radio 4, the New York Times and Vox. I have also advised numerous policy makers and worked with the Ministry of Justice, Office for Artificial Intelligence, Information Commissioner's Office, Centre for Data Ethics and Innovation, the Department for Digital, Culture, Media and Sport, and the Department of Health and Social Care.

But my most important research is the constant learning involved with being a loving father and husband 👨‍👩‍👧! When I'm not figuring out how to do the above, you can find me at the bouldering gym figuring out how to ascend a bouldering problem 🧗

Please see here for a full list of publications and links to articles (with and without paywalls 💰).

Invited Talk: Amy Zhang

Dec. 9, 2022, 2 p.m.


Amy Zhang

Invited Talk: Igor Mordatch

Dec. 9, 2022, 3 p.m.


Igor Mordatch

Invited Talk: Invited Talk