Virtually all deep learning is built upon the notion of explicit computation: the layers of a network are written as the explicit, step-by-step computations that map inputs to outputs. But a rising trend in deep learning takes a different approach: implicit layers, where one instead specifies the conditions that a layer’s output must satisfy. Such architectures date back to early work on recurrent networks but have recently gained a great deal of attention as the idea behind Neural ODEs, Deep Equilibrium Models (DEQs), FFJORD, optimization layers, SVAEs, implicit meta-learning, and many other methods. These methods can have substantial conceptual, computational, and modeling benefits: they often make it much easier to specify simple-yet-powerful architectures, can vastly reduce the memory consumption of deep networks, and allow more natural modeling of, for example, continuous-time phenomena.
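To make the distinction concrete, the two kinds of layer can be written side by side. The notation here is an assumption introduced for illustration (input $x$, output $z$, explicit map $f$, condition $g$); it is not taken from the tutorial itself.

```latex
% Explicit layer: the output is computed directly from the input.
z = f(x)

% Implicit layer: the output is whatever satisfies a stated condition.
\text{find } z^\star \text{ such that } g(x, z^\star) = 0

% Example (fixed-point form, as in equilibrium models):
g(x, z) = z - f(x, z)
\quad\Longrightarrow\quad
z^\star = f(x, z^\star)
```

In the fixed-point example, the layer's output is defined by the equation it solves rather than by a fixed sequence of operations, so any root-finding or iterative method may be used to compute it.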
This tutorial will provide a unified perspective on implicit layers, illustrating how the implicit modeling framework encompasses all the models discussed above, and providing a practical view of how to integrate such approaches into modern deep learning systems. We will cover the history and motivation of implicit layers, discuss how to solve the resulting "forward" inference problem, and then show how to compute gradients through such layers in the backward pass via implicit differentiation. Throughout, we will highlight several applications of these methods in Neural ODEs, DEQs, and other settings. The tutorial will be accompanied by an interactive monograph on implicit layers: a set of Colab notebooks with code in both the JAX and PyTorch libraries.
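As a rough illustration of the forward solve and the implicit backward pass described above, the following JAX sketch defines a small fixed-point layer. It is a minimal example under stated assumptions, not the tutorial's own code: the layer map `layer_fn` (here `tanh(W z + x)`), the helper `solve_fixed_point`, the naive iteration count, and the example values of `W` and `x` are all hypothetical choices made for illustration. The backward pass uses the implicit function theorem instead of differentiating through the solver's iterations.

```python
import jax
import jax.numpy as jnp


def layer_fn(W, x, z):
    # Hypothetical layer map (an assumption for this sketch): z = tanh(W z + x).
    return jnp.tanh(W @ z + x)


def solve_fixed_point(f, z0, num_iters=100):
    # Naive forward iteration z <- f(z); a real solver might use
    # Anderson acceleration or Newton's method instead.
    return jax.lax.fori_loop(0, num_iters, lambda _, z: f(z), z0)


def _forward(W, x):
    # Forward "inference" problem: find z* with z* = layer_fn(W, x, z*).
    return solve_fixed_point(lambda z: layer_fn(W, x, z), jnp.zeros_like(x))


@jax.custom_vjp
def implicit_layer(W, x):
    return _forward(W, x)


def implicit_layer_fwd(W, x):
    z_star = _forward(W, x)
    # Only the solution is saved, not the solver's intermediate iterates.
    return z_star, (W, x, z_star)


def implicit_layer_bwd(residuals, g):
    W, x, z_star = residuals
    # Vector-Jacobian product with df/dz evaluated at the fixed point.
    _, vjp_z = jax.vjp(lambda z: layer_fn(W, x, z), z_star)
    # Implicit differentiation: solve u = g + (df/dz)^T u,
    # i.e. u = (I - (df/dz)^T)^{-1} g, again by simple iteration.
    u = solve_fixed_point(lambda u: g + vjp_z(u)[0], jnp.zeros_like(g))
    # Push u back through the layer's direct dependence on (W, x).
    _, vjp_wx = jax.vjp(lambda W_, x_: layer_fn(W_, x_, z_star), W, x)
    return vjp_wx(u)


implicit_layer.defvjp(implicit_layer_fwd, implicit_layer_bwd)

# Example usage with a small W so that the iteration is a contraction.
W = jnp.array([[0.2, -0.1], [0.05, 0.3]])
x = jnp.array([0.5, -1.0])
z_star = implicit_layer(W, x)
grad_W, grad_x = jax.grad(
    lambda W_, x_: jnp.sum(implicit_layer(W_, x_)), argnums=(0, 1)
)(W, x)
```

Because the backward rule needs only the fixed point itself, memory use does not grow with the number of solver iterations. The sketch assumes the iteration converges, which holds here because `W` has small norm; in general that has to be ensured by the choice of layer or solver.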