Poster
Efficient and Accurate Gradients for Neural SDEs
Patrick Kidger · James Foster · Xuechen (Chen) Li · Terry Lyons
Neural SDEs combine many of the best qualities of both RNNs and SDEs, and as such are a natural choice for modelling many types of temporal dynamics. They offer memory efficiency, high-capacity function approximation, and strong priors on model space. Neural SDEs may be trained as VAEs or as GANs; in either case it is necessary to backpropagate through the SDE solve. In particular, this may be done by constructing a backwards-in-time SDE whose solution is the desired parameter gradients. However, this has previously suffered from severe speed and accuracy issues, due to high computational complexity, numerical errors in the SDE solve, and the cost of reconstructing Brownian motion. Here, we make several technical innovations to overcome these issues. First, we introduce the \textit{reversible Heun method}: a new SDE solver that is algebraically reversible, which reduces numerical gradient errors to almost zero and improves several test metrics by substantial margins over the state-of-the-art. Moreover, it requires half as many function evaluations as comparable solvers, giving up to a $1.98\times$ speedup. Second, we introduce the \textit{Brownian interval}: a new and computationally efficient way of exactly sampling \textit{and reconstructing} Brownian motion, in contrast to previous reconstruction techniques, which are both approximate and relatively slow. This gives up to a $10.6\times$ speed improvement over previous techniques. Third, when specifically training Neural SDEs as GANs (Kidger et al. 2021), we demonstrate how SDE-GANs may be trained through careful weight clipping and choice of activation function. This reduces computational cost (giving up to a $1.87\times$ speedup) and removes the truncation errors of the double adjoint required for gradient penalty, substantially improving several test metrics. Altogether these techniques offer substantial improvements over the state-of-the-art, with respect to both training speed and classification, prediction, and MMD test metrics. We have contributed implementations of all of our techniques to the \texttt{torchsde} library to help facilitate their adoption.
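The abstract notes that these techniques were contributed to \texttt{torchsde}. As a minimal sketch of how they fit together there, assuming a recent \texttt{torchsde} release: the method strings 'reversible_heun' / 'adjoint_reversible_heun' and the BrownianInterval class follow torchsde's documented interface, while the toy drift and diffusion networks here are purely illustrative.

```python
import torch
import torchsde

class NeuralSDE(torch.nn.Module):
    noise_type = "diagonal"
    sde_type = "stratonovich"  # the reversible Heun method targets Stratonovich SDEs

    def __init__(self, state_size=4):
        super().__init__()
        self.drift_net = torch.nn.Linear(state_size, state_size)      # illustrative drift network
        self.diffusion_net = torch.nn.Linear(state_size, state_size)  # illustrative diffusion network

    def f(self, t, y):  # drift
        return self.drift_net(y).tanh()

    def g(self, t, y):  # diffusion (diagonal noise: one noise channel per state channel)
        return self.diffusion_net(y).tanh()

sde = NeuralSDE()
y0 = torch.zeros(16, 4)            # (batch, state)
ts = torch.linspace(0.0, 1.0, 32)  # times at which to return the solution

# The Brownian interval: exact sampling *and* reconstruction of Brownian motion,
# so that the backwards-in-time SDE re-queries the same sample path.
bm = torchsde.BrownianInterval(t0=0.0, t1=1.0, size=y0.shape)

# Backpropagation through the SDE solve via the algebraically reversible
# reversible Heun solver, paired with its matching adjoint.
ys = torchsde.sdeint_adjoint(
    sde, y0, ts, bm=bm, dt=0.05,
    method="reversible_heun",
    adjoint_method="adjoint_reversible_heun",
)
ys.sum().backward()  # gradients with respect to the drift/diffusion parameters
```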
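The careful-clipping recipe for SDE-GANs is spelled out in the paper itself; purely as an illustration of the general idea (a hard Lipschitz constraint on the discriminator in place of a gradient penalty, together with a 1-Lipschitz activation), one simple variant might look like the sketch below. The helper name and clipping bound are hypothetical, not the paper's exact scheme; LipSwish is Swish rescaled to be 1-Lipschitz (Chen et al. 2019).

```python
import torch

class LipSwish(torch.nn.Module):
    # Swish (SiLU) has Lipschitz constant approximately 1.1, so rescaling by
    # 1/1.1 makes the activation 1-Lipschitz.
    def forward(self, x):
        return torch.nn.functional.silu(x) / 1.1

def clip_discriminator_(discriminator: torch.nn.Module, bound: float) -> None:
    # Hypothetical helper: hard-clip every weight into [-bound, bound] after
    # each optimiser step. Constraining the discriminator's Lipschitz constant
    # this way avoids the gradient penalty, and hence the double adjoint (and
    # its truncation errors) that the penalty would require.
    with torch.no_grad():
        for p in discriminator.parameters():
            p.clamp_(-bound, bound)
```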
Author Information
Patrick Kidger (University of Oxford)
James Foster (University of Oxford)
Xuechen (Chen) Li (Stanford University)
Terry Lyons (University of Oxford)
More from the Same Authors
- 2022 : A Closer Look at the Calibration of Differentially Private Learners »
  Hanlin Zhang · Xuechen (Chen) Li · Prithviraj Sen · Salim Roukos · Tatsunori Hashimoto
- 2022 Poster: When Does Differentially Private Learning Not Suffer in High Dimensions? »
  Xuechen Li · Daogao Liu · Tatsunori Hashimoto · Huseyin A. Inan · Janardhan Kulkarni · Yin-Tat Lee · Abhradeep Guha Thakurta
- 2022 Poster: Positively Weighted Kernel Quadrature via Subsampling »
  Satoshi Hayakawa · Harald Oberhauser · Terry Lyons
- 2021 : Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing »
  Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto
- 2021 Workshop: The Symbiosis of Deep Learning and Differential Equations »
  Luca Celotti · Kelly Buchanan · Jorge Ortiz · Patrick Kidger · Stefano Massaroli · Michael Poli · Lily Hu · Ermal Rrapaj · Martin Magill · Thorsteinn Jonsson · Animesh Garg · Murtadha Aldeer
- 2021 : Equinox: neural networks in JAX via callable PyTrees and filtered transformations »
  Patrick Kidger
- 2021 Poster: Higher Order Kernel Mean Embeddings to Capture Filtrations of Stochastic Processes »
  Cristopher Salvi · Maud Lemercier · Chong Liu · Blanka Horvath · Theodoros Damoulas · Terry Lyons
- 2020 Poster: Neural Controlled Differential Equations for Irregular Time Series »
  Patrick Kidger · James Morrill · James Foster · Terry Lyons
- 2020 Spotlight: Neural Controlled Differential Equations for Irregular Time Series »
  Patrick Kidger · James Morrill · James Foster · Terry Lyons
- 2019 Poster: Deep Signature Transforms »
  Patrick Kidger · Patric Bonnier · Imanol Perez Arribas · Cristopher Salvi · Terry Lyons
- 2019 Poster: Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond »
  Xuechen (Chen) Li · Denny Wu · Lester Mackey · Murat Erdogdu
- 2019 Spotlight: Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond »
  Xuechen (Chen) Li · Denny Wu · Lester Mackey · Murat Erdogdu
- 2018 Poster: Isolating Sources of Disentanglement in Variational Autoencoders »
  Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud
- 2018 Oral: Isolating Sources of Disentanglement in Variational Autoencoders »
  Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud