Invited Talks
Though seemingly opposite, doom and optimism regarding generative AI's spectacular rise both center on AGI or even superintelligence as a pivotal moment. But generative AI operates in a manner distinct from human intelligence; it is not a less intelligent human on a chip slowly getting smarter, any more than cars were mere horseless carriages. It must be understood on its own terms. And even if the Terminator isn't coming to kill us and superintelligence isn't racing to save us, generative AI does bring profound challenges, well beyond the usual worries such as employment effects. Technology facilitates progress by turning the difficult into the easy, the rare into the ubiquitous, the scarce into the abundant, the manual into the automated, and the artisanal into the mass-produced. While potentially positive in the long term, these inversions are extremely destabilizing during the transition, shattering the correlations and assumptions of a social order that relied on the superseded difficulties as mechanisms of proof, filtering, sorting, and signaling. For example, while few would dispute the value of the printing press or books, their introduction led to such destructive upheaval that the resulting religious wars caused proportionally more deaths than all other major wars and pandemics since combined. Historically, a new technology's revolutionary impact has come from making what's already possible and desired cheap, easy, fast, and large-scale, not from surpassing the outdated or ill-fitting benchmarks that technologists tend to focus on. As such, Artificial Good-Enough Intelligence can unleash chaos and destruction long before AGI is reached, if it ever is. Existing AI is already good enough to blur or pulverize our mechanisms of proof of accuracy, effort, veracity, authenticity, sincerity, and even humanity. The tumult of such a transition will require extensive technological, regulatory, and societal effort to counter. But the first step is having the right nightmares.
Zeynep Tufekci
Zeynep Tufekci - Henry G. Bryant Professor of Sociology and Public Affairs, Princeton University; New York Times Columnist. Tufekci examines the interplay of science, technology, and society through a sociological framework and complex systems lens, focusing especially on digital, computational, and artificial intelligence technologies. She was a 2022 Pulitzer Prize finalist for commentary on the COVID-19 pandemic. Her book "Twitter and Tear Gas: The Power and Fragility of Networked Protest" examines the dynamics of social movements in the digital age. She is also a faculty associate at the Berkman Klein Center for Internet & Society at Harvard University.
Today’s generative AI systems—termed “alien intelligences” by some researchers—have exceeded human performance on many benchmarks meant to test humanlike cognitive capabilities. However, these systems still struggle in unhumanlike ways on real-world tasks requiring those same capabilities. This disconnect may be due in part to the AI community's neglect of well-founded experimental protocols for evaluating cognition. In this talk I will summarize several recommendations on experimental methods from developmental and comparative psychology—fields that study the “alien intelligences” of babies and non-human animals—and demonstrate the application of such methods in two case studies of cognitive abilities in LLMs: analogical reasoning and visual abstraction.
Melanie Mitchell
Melanie Mitchell - Professor, Santa Fe Institute. Mitchell's research areas include AI, cognitive science, and complex systems, with a focus on conceptual abstraction and analogy-making in humans and AI systems. She authored "Complexity: A Guided Tour," which won the 2010 Phi Beta Kappa Science Book Award, and "Artificial Intelligence: A Guide for Thinking Humans," which was named one of the five best books on AI by both the New York Times and the Wall Street Journal. She received her PhD from the University of Michigan under Douglas Hofstadter, with whom she developed the Copycat cognitive architecture.
During the past 15 years or so, I have worked on a series of seemingly distinct but ultimately related problems, including machine learning algorithms, generative modeling with neural networks, machine translation, language modeling, medical imaging, a bit of healthcare, protein modeling, and a bit of drug discovery. I chose to work on some of these problems intentionally, while it was pure serendipity that I worked on others. It was only in hindsight that these seemingly different problems turned out to be closely related to each other from technical, social, and personal perspectives. In this talk, I plan to offer my own retrospective on these choices, be they intentional or not, and share with you my thoughts on what our own discipline, which is sometimes called computer science, data science, machine learning, or artificial intelligence, really is.
Kyunghyun Cho
Kyunghyun Cho - Glen de Vries Professor of Health Statistics, NYU; Executive Director of Frontier Research, Prescient Design, Genentech. Cho's work spans machine learning and natural language processing. He co-developed the Gated Recurrent Unit (GRU) architecture and has contributed to neural machine translation and sequence-to-sequence learning. He is a CIFAR Fellow of Learning in Machines & Brains and received the 2021 Samsung Ho-Am Prize in Engineering. He served as program chair for ICLR 2020, NeurIPS 2022, and ICML 2022.
Scaling laws suggest that “more is more” — brute-force scaling of data and compute leads to stronger AI capabilities. However, despite rapid progress on benchmarks, state-of-the-art models still exhibit "jagged intelligence," indicating that current scaling approaches may have limitations in terms of sustainability and robustness. Additionally, while the volume of papers on arXiv continues to grow rapidly, our scientific understanding of artificial intelligence hasn't kept pace with engineering advances, and the current literature presents seemingly contradictory findings that can be difficult to reconcile. In this talk, I will discuss key insights into the strengths and limitations of LLMs, examine when reinforcement learning succeeds or struggles in reasoning tasks, and explore methods for enhancing reasoning capabilities in smaller language models to help them close the gap against their larger counterparts in specific domains.
Yejin Choi
Yejin Choi - Professor of Computer Science, Stanford University; Dieter Schwarz Foundation Senior Fellow, Stanford HAI; Distinguished Scientist, NVIDIA. Choi's research focuses on natural language processing, with an emphasis on commonsense reasoning and language understanding. She is a 2022 MacArthur Fellow and was named to Time's Most Influential People in AI in 2023. She has received multiple Test of Time Awards from ACL and CVPR, and Best Paper Awards at venues including ACL, EMNLP, ICML, and NeurIPS. She previously held positions at the University of Washington and the Allen Institute for AI.
Deep neural networks have revolutionized artificial intelligence, yet their inner workings remain poorly understood. This talk presents mathematical analyses of the nonlinear dynamics of learning in several solvable deep network models, offering theoretical insights into the role of depth. These models reveal how learning algorithms, data structure, initialization schemes, and architectural choices interact to produce hidden representations that afford complex generalization behaviors. A recurring theme across these analyses is a neural race: competing pathways within a deep network vie to explain the data, with an implicit bias toward shared representations. These shared representations in turn shape the network’s capacity for systematic generalization, multitasking, and transfer learning. I will show how such principles manifest across diverse architectures—including feedforward, recurrent, and linear attention networks. Together, these results provide analytic foundations for understanding how environmental statistics, network architecture, and learning dynamics jointly structure the emergence of neural representations and behavior.
Andrew Saxe
Andrew Saxe - Professor of Theoretical Neuroscience & Machine Learning, Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre, UCL. Saxe's research focuses on mathematical theories of learning in neural networks. He has developed exact solutions for learning dynamics in deep linear networks and studies connections between artificial and biological learning systems. He is a CIFAR Fellow of Learning in Machines & Brains and recipient of the 2019 Wellcome Trust Beit Prize. His work includes theoretical analyses of semantic development and the dynamics of representation learning.
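As a concrete companion to the solvable models mentioned in the abstract above, the following is a minimal sketch, in plain NumPy, of gradient-descent learning in a two-layer linear network; the dimensions, target singular values, learning rate, and initialization scale are illustrative assumptions rather than details from the talk. With a small initialization, the stronger input-output modes are learned first, each along a roughly sigmoidal trajectory.

```python
import numpy as np

# Minimal sketch (not from the talk): gradient-descent learning dynamics in a
# two-layer *linear* network, the class of solvable models analyzed in this
# line of work. With a small random initialization, the singular modes of the
# input-output correlation matrix are learned one after another, stronger
# modes first, even though the network's input-output map is linear.

rng = np.random.default_rng(0)

# Target linear map with well-separated singular values (assumed for illustration).
d_in, d_hidden, d_out = 8, 8, 8
U, _ = np.linalg.qr(rng.standard_normal((d_out, d_out)))
V, _ = np.linalg.qr(rng.standard_normal((d_in, d_in)))
s_true = np.array([5.0, 3.0, 1.5] + [0.0] * (d_in - 3))
target = U @ np.diag(s_true) @ V.T          # input-output correlation matrix

# Two-layer linear network y = W2 @ W1 @ x; with whitened inputs the
# full-batch gradient of the squared error depends only on `target`.
scale = 1e-3                                 # small init -> clear stage-like dynamics
W1 = scale * rng.standard_normal((d_hidden, d_in))
W2 = scale * rng.standard_normal((d_out, d_hidden))
lr, steps = 0.002, 4000

for t in range(steps):
    err = target - W2 @ W1                   # full-batch error in the linear map
    W1 += lr * (W2.T @ err)                  # gradient of 0.5 * ||target - W2 W1||^2
    W2 += lr * (err @ W1.T)
    if t % 400 == 0:
        # Leading singular values of the learned map rise roughly one at a time.
        sv = np.linalg.svd(W2 @ W1, compute_uv=False)[:3]
        print(f"step {t:4d}  learned modes ~ {np.round(sv, 2)}")
```

Keeping the network linear is what makes such trajectories analytically tractable; the nonlinearity of the learning dynamics comes entirely from the product W2 @ W1.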
As AI has become a huge industry, to an extent it has lost its way. What is needed to get us back on track to true intelligence? We need agents that learn continually. We need world models and planning. We need knowledge that is high-level and learnable. We need to meta-learn how to generalize. The Oak architecture is one answer to all these needs. It is a model-based RL architecture with three special features: 1) all of its components learn continually, 2) each learned weight has a dedicated step-size parameter that is meta-learned using online cross-validation, and 3) abstractions in state and time are continually created in a five-step progression: Feature Construction, posing a SubTask based on the feature, learning an Option to solve the subtask, learning a Model of the option, and Planning using the option’s model (the FC-STOMP progression). The Oak architecture is rather meaty; in this talk we give an outline and point to the many works, prior and contemporaneous, that are contributing to its overall vision of how superintelligence can arise from an agent’s experience.
Rich Sutton
Rich Sutton - Research Scientist, Keen Technologies; Professor, University of Alberta; Chief Scientific Advisor, Amii; Chief Scientific Officer, ExperienceFlow.ai. Sutton co-developed temporal difference learning and policy gradient methods in reinforcement learning. He received the 2024 Turing Award with Andrew Barto for foundational contributions to reinforcement learning. He is co-author of the textbook "Reinforcement Learning: An Introduction" and is a Fellow of the Royal Society and the Royal Society of Canada. His research focuses on computational principles underlying learning and decision-making.
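The second of Oak's special features in the abstract above, a dedicated, meta-learned step size per weight, has a classic precursor in Sutton's IDBD algorithm for linear regression. The sketch below is an illustrative IDBD-style implementation of that idea only, not the Oak architecture itself; the tracking task, input dimensions, and hyperparameters are assumptions made for the example.

```python
import numpy as np

# Illustrative IDBD-style sketch (after Sutton, 1992): each weight of a linear
# predictor gets its own step size alpha_i = exp(beta_i), and beta_i is adapted
# online so that step sizes grow for inputs whose weights keep being pushed in
# a consistent direction and shrink for irrelevant, noisy ones. This is only an
# analogue of Oak's per-weight meta-learned step sizes, not the Oak architecture.

rng = np.random.default_rng(0)
n_inputs, n_relevant = 10, 3
w_true = np.zeros(n_inputs)
w_true[:n_relevant] = 1.0                 # only the first few inputs matter

w = np.zeros(n_inputs)                    # weights of the linear predictor
beta = np.full(n_inputs, np.log(0.01))    # per-weight log step sizes
h = np.zeros(n_inputs)                    # trace of recent weight updates
theta = 0.005                             # meta step size (assumed)

for t in range(1, 20001):
    if t % 2000 == 0:                     # nonstationary target: flip one relevant weight
        i = rng.integers(n_relevant)
        w_true[i] = -w_true[i]

    x = rng.standard_normal(n_inputs)
    y = w_true @ x + 0.1 * rng.standard_normal()   # noisy target
    delta = y - w @ x                              # prediction error

    beta += theta * delta * x * h                  # meta-learn the log step sizes
    alpha = np.exp(beta)                           # per-weight step sizes
    w += alpha * delta * x                         # LMS update with per-weight rates
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x

# Relevant inputs should have earned much larger step sizes than irrelevant ones.
print("step sizes:", np.round(np.exp(beta), 3))
print("weights:   ", np.round(w, 2))
```

In this sketch the step sizes of the relevant inputs grow while those of the irrelevant inputs shrink, which is the tracking behavior that dedicated, meta-learned step sizes are intended to provide in a continually learning agent.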