
(Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization
David Duvenaud · J. Zico Kolter · Matthew Johnson

Mon Dec 07 01:30 PM -- 04:00 PM (PST)

Virtually all deep learning is built upon the notion of explicit computation: layers of a network are written in terms of their explicit step-by-step computations used to map inputs to outputs. But a rising trend in deep learning takes a different approach: implicit layers, where one instead specifies the conditions for a layer’s output to satisfy. Such architectures date back to early work on recurrent networks but have recently gained a great deal of attention as the approach behind Neural ODEs, Deep Equilibrium Models (DEQs), FFJORD, optimization layers, SVAEs, implicit meta-learning, and many other approaches. These methods can have substantial conceptual, computational, and modeling benefits: they often make it much easier to specify simple-yet-powerful architectures, can vastly reduce the memory consumption of deep networks, and allow more natural modeling of e.g. continuous-time phenomena.

This tutorial will provide a unified perspective on implicit layers, illustrating how the implicit modeling framework encompasses all the models discussed above, and providing a practical view of how to integrate such approaches into modern deep learning systems. We will cover the history and motivation of implicit layers, discuss how to solve the resulting "forward" inference problem, and then highlight how to compute gradients through such layers in the backward pass, via implicit differentiation. Throughout, we will highlight several applications of these methods in Neural ODEs, DEQs, and other settings. The tutorial will be accompanied by an interactive monograph on implicit layers: a set of interactive Colab notebooks with code in both the JAX and PyTorch libraries.
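As a rough illustration of the forward solve and the implicit-differentiation backward pass described above, here is a minimal scalar sketch in plain Python. It is not taken from the tutorial's notebooks. The layer `z = tanh(w*z + x)`, the function names `solve_fixed_point` and `implicit_grads`, and the parameter values are all assumptions chosen for illustration. The output `z*` is defined only by the condition it must satisfy, and its gradients come from the implicit function theorem rather than from backpropagating through the solver iterations.

```python
import math


def solve_fixed_point(w, x, tol=1e-12, max_iter=1000):
    """Forward pass: find z* with z* = tanh(w * z* + x) by simple iteration.

    Assumes |w| < 1 so that the map is a contraction and the iteration converges.
    """
    z = 0.0
    for _ in range(max_iter):
        z_next = math.tanh(w * z + x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def implicit_grads(w, x):
    """Backward pass via the implicit function theorem.

    The output satisfies g(z, w, x) = z - tanh(w*z + x) = 0, so
        dz/dx = -(dg/dz)^(-1) * dg/dx = s / (1 - w*s)
        dz/dw = -(dg/dz)^(-1) * dg/dw = z * s / (1 - w*s)
    where s = 1 - tanh(w*z + x)^2 = 1 - z^2 at the fixed point.
    No intermediate solver iterates are stored or differentiated.
    """
    z = solve_fixed_point(w, x)
    s = 1.0 - z * z
    denom = 1.0 - w * s
    return z, s / denom, z * s / denom


# Hypothetical parameter values, chosen so that |w| < 1.
w, x = 0.5, 0.3
z_star, dz_dx, dz_dw = implicit_grads(w, x)
print("z* =", z_star, "dz/dx =", dz_dx, "dz/dw =", dz_dw)
```

Because the gradient depends only on the converged `z*`, the memory cost of the backward pass does not grow with the number of solver iterations, which is one way to see the memory savings mentioned in the abstract.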

Author Information

David Duvenaud (University of Toronto)

David Duvenaud is an assistant professor in computer science at the University of Toronto. His research focuses on continuous-time models, latent-variable models, and deep learning. He completed his postdoc at Harvard University and his Ph.D. at the University of Cambridge. David also co-founded Invenia, an energy forecasting and trading company.

J. Zico Kolter (Carnegie Mellon University / Bosch Center for AI)

Zico Kolter is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, and also serves as Chief Scientist of AI Research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a particular focus on developing more robust, explainable, and rigorous methods in deep learning. In addition, he has worked on a number of application areas, highlighted by work on sustainability and smart energy systems. He is the recipient of the DARPA Young Faculty Award, and best paper awards at KDD, IJCAI, and PESGM.

Matthew Johnson (Google Brain)

Matt Johnson is a research scientist at Google Brain interested in software systems powering machine learning research. He is the tech lead for JAX, a system for composable function transformations in Python. He was a postdoc at Harvard University with Ryan Adams, working on composing graphical models with neural networks and applications in neurobiology. His Ph.D. is from MIT, where he worked with Alan Willsky on Bayesian nonparametrics, time series models, and scalable inference.
