Workshop
The Symbiosis of Deep Learning and Differential Equations -- III
Luca Herranz-Celotti · Martin Magill · Ermal Rrapaj · Winnie Xu · Qiyao Wei · Archis Joglekar · Michael Poli · Animashree Anandkumar
Room 255 - 257
In the deep learning community, a remarkable trend is emerging: powerful architectures are created by leveraging classical mathematical modelling tools from diverse fields such as differential equations, signal processing, and dynamical systems. Differential equations are a prime example: research on neural differential equations has expanded into a large zoo of related models, with applications ranging from time series analysis to robotics control. Score-based diffusion models, which have deep connections to neural differential equations, are among the state-of-the-art tools for generative modelling. Other examples of deep architectures with important ties to classical fields of mathematical modelling include normalizing flows, graph neural diffusion models, Fourier neural operators, architectures exhibiting domain-specific equivariances, and latent dynamical models (e.g., latent NDEs, H3, S4, Hyena).

The previous two editions of the Workshop on the Symbiosis of Deep Learning and Differential Equations have promoted the bidirectional exchange of ideas at the intersection of classical mathematical modelling and modern deep learning. On the one hand, this includes the use of differential equations and similar tools to create neural architectures, accelerate deep learning optimization problems, or study theoretical problems in deep learning. On the other hand, the Workshop also explores the use of deep learning methods to improve the speed, flexibility, or realism of computer simulations. Last year, we noted a particularly keen interest from the audience in neural architectures that leverage classical mathematical models, such as those listed above. We therefore propose that the third edition of this Workshop focus on this theme.
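Many of the architectures above share one core construction: replace a discrete layer stack with a learned vector field that is numerically integrated through time. The sketch below is purely illustrative (fixed-step Euler integration, random weights standing in for a trained network; it is not any specific system discussed at the workshop):

```python
import numpy as np

def neural_ode_forward(z0, weights, t0=0.0, t1=1.0, n_steps=100):
    """Integrate dz/dt = f(z; weights) with fixed-step Euler.

    f is a one-hidden-layer MLP standing in for a learned vector field;
    real systems use adaptive solvers and adjoint-based gradients.
    """
    W1, b1, W2, b2 = weights

    def f(z):
        return np.tanh(z @ W1 + b1) @ W2 + b2

    h = (t1 - t0) / n_steps
    z = z0
    for _ in range(n_steps):
        z = z + h * f(z)  # Euler step: z_{k+1} = z_k + h f(z_k)
    return z

rng = np.random.default_rng(0)
d, hidden = 2, 16
weights = (rng.normal(scale=0.1, size=(d, hidden)), np.zeros(hidden),
           rng.normal(scale=0.1, size=(hidden, d)), np.zeros(d))
z1 = neural_ode_forward(np.ones((4, d)), weights)
print(z1.shape)  # (4, 2)
```

The continuous-depth view is what lets the models below trade solver accuracy for compute and handle irregularly sampled data.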
Schedule
Sat 6:30 a.m. - 6:45 a.m. | Introduction and opening remarks (Introduction)
Sat 6:45 a.m. - 7:30 a.m. | Philip M. Kim - Machine learning methods for protein, peptide and antibody design (Keynote Talk)
Sat 7:30 a.m. - 7:45 a.m. | Effective Latent Differential Equation Models via Attention and Multiple Shooting (Spotlight)
The GOKU-net is a continuous-time generative model that allows leveraging prior knowledge in the form of differential equations. We present GOKU-UI, an evolution of the GOKU-nets that integrates attention mechanisms and a novel multiple shooting training strategy in the latent space. On simulated data, GOKU-UI significantly improves reconstruction and forecasting performance, outperforming baselines even with 16 times less training data. Applied to empirical human brain data, using stochastic Stuart-Landau oscillators, it effectively captures complex brain dynamics, surpassing baselines in reconstruction and better predicting future brain activity up to 15 seconds ahead. Ultimately, our research provides further evidence of the fruitful symbiosis between established scientific insights and modern machine learning.
Germán Abrevaya · Mahta Ramezanian-Panahi · Jean-Christophe Gagnon-Audet · Pablo Polosecki · Irina Rish · Silvina Ponce Dawson · Guillermo Cecchi · Guillaume Dumas
Sat 7:45 a.m. - 8:00 a.m. | Adaptive Resolution Residual Networks (Spotlight)
We introduce Adaptive Resolution Residual Networks (ARRNs), a form of neural operator that enables the creation of networks for signal-based tasks that can be rediscretized to suit any signal resolution. ARRNs are composed of a chain of Laplacian residuals, each containing ordinary layers, which do not need to be rediscretizable for the whole network to be rediscretizable. ARRNs require fewer Laplacian residuals for exact evaluation on lower-resolution signals, which greatly reduces computational cost. ARRNs also implement Laplacian dropout, which encourages networks to become robust to low-bandwidth signals. ARRNs can thus be trained once at high resolution and then rediscretized on the fly at the suitable resolution with great robustness.
Léa Demeule · Mahtab Sandhu · Glen Berseth
Sat 8:00 a.m. - 8:15 a.m. | Break
Sat 8:15 a.m. - 9:00 a.m. | Poster Session 1
Sat 9:00 a.m. - 9:45 a.m. | Yulia Rubanova - Learning efficient and scalable simulation using graph networks (Keynote Talk)
Sat 9:45 a.m. - 10:00 a.m. | Can Physics-informed Neural Operators self-improve? (Spotlight)
Self-training techniques have shown remarkable value across many deep learning models and tasks. However, such techniques remain largely unexplored in the context of learning fast solvers for systems of partial differential equations (e.g., neural operators). In this work, we explore the use of self-training for Fourier Neural Operators (FNOs). Neural operators emerged as a data-driven technique; however, data from experiments or traditional solvers is not always readily available. Physics-Informed Neural Operators (PINO) overcome this constraint by utilizing a physics loss for training, but the accuracy of PINO trained without data does not match the performance obtained by training with data. In this work we show that self-training can be used to close this gap in performance. We examine canonical examples, namely the 1D Burgers and 2D Darcy PDEs, to showcase the efficacy of self-training. Specifically, FNOs trained exclusively with physics loss through self-training reach an error within $1.07\times$ (Burgers) and $1.02\times$ (Darcy) of FNOs trained with both data and physics loss. Furthermore, we discover that pseudo-labels can be used for self-training without necessarily training to convergence in each iteration. As a consequence, we are able to discover self-training schedules that improve upon the baseline performance of PINO in terms of both accuracy and time.
Ritam Majumdar · Amey Varhade · Shirish Karande · Lovekesh Vig
Sat 10:00 a.m. - 11:00 a.m. | Lunch Break
Sat 11:00 a.m. - 11:45 a.m. | Michael Bronstein - Physics-inspired learning on graphs (Keynote Talk)
Sat 11:45 a.m. - 12:00 p.m. | Vertical AI-driven Scientific Discovery (Spotlight)
Automating scientific discovery has been a grand goal of Artificial Intelligence (AI) and would bring tremendous societal impact if it succeeds. Despite exciting progress, most endeavors in learning scientific equations from experimental data focus on horizontal discovery paths, i.e., they directly search for the best equation in the full hypothesis space. Horizontal paths are challenging because of the associated exponentially large search space. Our work explores an alternative vertical path, which builds scientific equations incrementally, starting from one that models data in control-variable experiments in which most variables are held constant. It then extends the expressions learned in previous generations by adding new independent variables, using new control-variable experiments in which these variables are allowed to vary. This vertical path is motivated by human scientific discovery processes. Experimentally, we demonstrate that such vertical discovery paths expedite symbolic regression. They also improve the learning of physics models describing nano-structure evolution in computational materials science.
Yexiang Xue
Sat 12:00 p.m. - 12:15 p.m. | ELeGANt: An Euler-Lagrange Analysis of Wasserstein Generative Adversarial Networks (Spotlight)
We consider Wasserstein generative adversarial networks (WGANs) with a gradient-norm penalty and analyze the underlying functional optimization problem within a variational setting. The optimal discriminator in this setting is the solution to a Poisson differential equation and can be obtained in closed form without having to train a neural network. We illustrate this by employing a Fourier-series approximation to solve the Poisson differential equation. Experimental results on synthesized low-dimensional Gaussian data demonstrate superior convergence behavior of the proposed approach in comparison with baseline WGAN variants that employ weight-clipping, gradient, or Lipschitz penalties on the discriminator. Further, within this setting, the optimal Lagrange multiplier can be computed in closed form and serves as a proxy for measuring GAN generator convergence. This work is an extended abstract summarizing Asokan and Seelamantula (2023).
Siddarth Asokan · Chandra Seelamantula
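The closed-form discriminator above rests on solving a Poisson equation with a Fourier-series ansatz. A one-dimensional sketch of that sub-step is easy to write down (the Dirichlet boundary conditions, unit interval, and sine basis here are simplifying assumptions for illustration, not the paper's setting):

```python
import numpy as np

def poisson_fourier_1d(f_vals, x, n_modes=50):
    """Solve u'' = f on [0, 1] with u(0) = u(1) = 0 via a sine series:
    if f = sum_k f_k sin(k pi x), then u_k = -f_k / (k pi)^2."""
    k = np.arange(1, n_modes + 1)
    basis = np.sin(np.pi * np.outer(x, k))   # (n_points, n_modes)
    # Fit the series coefficients of f by least squares on the grid.
    f_k, *_ = np.linalg.lstsq(basis, f_vals, rcond=None)
    u_k = -f_k / (np.pi * k) ** 2            # invert the Laplacian mode-wise
    return basis @ u_k

x = np.linspace(0.0, 1.0, 101)
u = poisson_fourier_1d(np.sin(np.pi * x), x)
# Exact solution for f = sin(pi x) is u = -sin(pi x) / pi^2.
err = np.max(np.abs(u + np.sin(np.pi * x) / np.pi ** 2))
print(err)
```

Because differentiation is diagonal in the Fourier basis, the solve reduces to scaling coefficients, which is what makes the closed-form discriminator cheap.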
Sat 12:15 p.m. - 1:00 p.m. | Albert Gu - Structured State Space Models for Deep Sequence Modeling (Keynote Talk)
Sat 1:00 p.m. - 1:15 p.m. | TANGO: Time-reversal Latent GraphODE for Multi-Agent Dynamical Systems (Spotlight)
Learning complex multi-agent system dynamics from data is crucial across many domains, such as physical simulation and material modeling. Existing physics-informed approaches, like Hamiltonian Neural Networks, introduce inductive bias by strictly following the law of energy conservation. However, many real-world systems do not strictly conserve energy. We therefore focus on Time-Reversal Symmetry, a broader physical principle indicating that system dynamics should remain invariant when time is reversed. This principle not only preserves energy in conservative systems but also serves as a strong inductive bias for non-conservative, reversible systems. In this paper, we propose a simple-yet-effective self-supervised regularization term as a soft constraint that aligns the forward and backward trajectories predicted by a continuous graph neural network-based ordinary differential equation (GraphODE). In addition, we theoretically show that our regularization essentially minimizes higher-order Taylor expansion terms during the ODE integration steps, which makes our model more noise-tolerant and even applicable to irreversible systems.
Zijie Huang · Wanjia Zhao · Jingdong Gao · Ziniu Hu · Xiao Luo · Yadi Cao · Yuanzhou Chen · Yizhou Sun · Wei Wang
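The time-reversal soft constraint described in this abstract can be illustrated with a toy sketch: roll a trajectory forward, roll backward from the endpoint with the negated vector field, and penalize disagreement. The Euler integrator and the linear rotation field below are illustrative assumptions, not the authors' GraphODE:

```python
import numpy as np

def rollout(z0, f, h, n):
    """Fixed-step Euler trajectory [z_0, ..., z_n]."""
    traj = [z0]
    for _ in range(n):
        traj.append(traj[-1] + h * f(traj[-1]))
    return np.stack(traj)

def time_reversal_loss(z0, f, h=0.01, n=50):
    """Penalize disagreement between the forward trajectory and a
    backward rollout started from the forward endpoint with -f."""
    fwd = rollout(z0, f, h, n)
    bwd = rollout(fwd[-1], lambda z: -f(z), h, n)
    return float(np.mean((fwd - bwd[::-1]) ** 2))

# Toy reversible dynamics: a pure rotation, dz/dt = A z with A skew-symmetric.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
loss = time_reversal_loss(np.array([1.0, 0.0]), lambda z: A @ z)
print(loss)  # tiny: only Euler discretization error, since the field is reversible
```

For a reversible field the penalty is near zero; for a dissipative field the forward and reversed paths diverge, which is what makes the term a useful training signal.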
Sat 1:15 p.m. - 1:30 p.m. | Break
Sat 1:30 p.m. - 2:30 p.m. | Poster Session 2
Sat 2:30 p.m. - 2:45 p.m. | Closing Remarks
Physics-Informed Transformer Networks (Poster)
Physics-informed neural networks have been recognized as a viable alternative to conventional numerical solvers for Partial Differential Equations (PDEs). However, a key challenge is their limited generalization across varied initial conditions. Addressing this, our study presents a novel physics-informed transformer model for learning the solution operator for PDEs. Leveraging the attention mechanism, our model is able to explore the relationships between its inputs. Furthermore, by using a physics-informed loss, our model is able to train without requiring ground-truth solutions as labelled training data, which are often costly to obtain. Additionally, our model is invariant to the discretization of the input domain, thus providing great flexibility. We validated our proposed method on the 1D Burgers' and the 2D Heat equations, demonstrating the model's competitive results compared to other standard physics-informed models for operator learning.
Fabricio Dos Santos · Tara Akhound-Sadegh · Siamak Ravanbakhsh
Generalized One-Shot Transfer Learning of Linear Ordinary and Partial Differential Equations (Poster)
We present a generalizable methodology to perform "one-shot" transfer learning on systems of linear ordinary and partial differential equations using physics-informed neural networks (PINNs). PINNs have attracted researchers as an avenue through which both data and studied physical constraints can be leveraged in learning solutions to differential equations. Despite their benefits, PINNs are currently limited by the computational costs needed to train such networks on different but related tasks. Transfer learning addresses this drawback. First, we describe a process to train PINNs on equations with varying conditions across multiple "heads". Second, we show how this multi-headed training process can be used to yield a latent-space representation of a particular differential equation form. Third, we derive closed-form formulas for generalized network weights that minimize the loss function. Finally, we demonstrate how the learned latent representation and derived network weights can be used to instantaneously transfer-learn solutions to new equations, demonstrating the ability to quickly solve many systems of equations in a variety of environments.
Pavlos Protopapas · Hari Raval
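The closed-form weight idea in this abstract can be illustrated in miniature: freeze a feature map (standing in for the trained PINN body) and obtain output weights for a new linear ODE by least squares over the ODE residual and initial condition. The random sinusoidal features and the weighting of the initial-condition row are assumptions made for this sketch, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 200)[:, None]

# Frozen feature map and its exact time derivative (a stand-in for a
# trained multi-headed PINN body).
omega = rng.normal(size=(1, 30))
phi = rng.uniform(0.0, 2.0 * np.pi, size=30)
H = np.sin(t * omega + phi)            # features H(t)
dH = omega * np.cos(t * omega + phi)   # dH/dt

def one_shot_solve(a, f, u0, ic_weight=10.0):
    """Closed-form output weights w for u(t) = H(t) w solving
    u' + a u = f(t), u(0) = u0: least squares over the ODE residual
    rows plus one (weighted) initial-condition row."""
    A = np.vstack([dH + a * H, ic_weight * H[:1]])
    b = np.concatenate([f(t).ravel(), [ic_weight * u0]])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

# New "task": u' + u = 0, u(0) = 1, whose solution is exp(-t);
# no gradient-based retraining is needed.
w = one_shot_solve(1.0, lambda s: np.zeros_like(s), 1.0)
u = H @ w
err = float(np.max(np.abs(u - np.exp(-t.ravel()))))
print(err)  # small, provided the frozen features span the solution well
```

Because the equation is linear in the output weights, each new task costs one linear solve rather than a full training run, which is the essence of the one-shot transfer.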
Towards Optimal Network Depths: Control-Inspired Acceleration of Training and Inference in Neural ODEs (Poster)
Neural Ordinary Differential Equations (ODEs) offer potential for learning continuous dynamics, but their slow training and inference limit broader use. This paper proposes spatial and temporal optimization inspired by control theory, seeking an optimal network depth that accelerates both training and inference while maintaining performance. Two approaches are presented: one treats training as a single-stage minimum-time optimal control problem, adjusting the terminal time; the other combines pre-training with a Lyapunov method, followed by safe terminal-time updates in a secondary stage. Experiments confirm the effectiveness of addressing Neural ODEs' speed limitations.
Keyan Miao · Konstantinos Gatsis
Causal Graph ODE: Continuous Treatment Effect Modeling in Multi-agent Dynamical Systems (Poster)
Real-world multi-agent systems are often dynamic and continuous, where agents interact over time and undergo changes in their trajectories. For example, COVID-19 transmission in the U.S. can be viewed as a multi-agent system, where states act as agents and daily population movements between them are interactions. Estimating counterfactual outcomes in such systems enables accurate future predictions and effective decision-making, such as formulating COVID-19 policies. However, existing methods fail to model the continuous dynamic effects of treatments on the outcome, especially when multiple treatments are applied simultaneously. To tackle this challenge, we propose Causal Graph Ordinary Differential Equations (CAG-ODE), a novel model that captures the continuous interaction among agents using a Graph Neural Network (GNN) as the ODE function. The key innovation of our model is to learn time-dependent representations of treatments and incorporate them into the ODE function, enabling precise predictions of potential outcomes. To mitigate confounding bias, we further propose two domain-adversarial learning-based objectives, which enable our model to learn balanced continuous representations that are not affected by treatments or interference. Experiments on two datasets demonstrate the superior performance of CAG-ODE.
Zijie Huang · Jeehyun Hwang · Junkai Zhang · Jinwoo Baik · Weitong ZHANG · Quanquan Gu · Dominik Wodarz · Yizhou Sun · Wei Wang
Data-Driven Neural-ODE Modeling for Breast Cancer Tumor Dynamics and Progression-Free Survival Predictions (Poster)
Pharmacokinetic/pharmacodynamic (PK/PD) modeling plays a pivotal role in novel drug development. Previous population-based PK/PD models encounter challenges when customized for individual patients. We aimed to investigate the feasibility of constructing a pharmacodynamic model for different phases of individual breast cancer pharmacodynamics, leveraging only limited data from early phases. To achieve that, we introduce an innovative approach, Data-driven Neural Ordinary Differential Equation (DN-ODE) modeling, for multiple tasks: breast cancer tumor dynamics and progression-free survival predictions. To validate the DN-ODE approach, we conducted experiments with early-phase clinical trial data from the amcenestrant (an oral treatment for breast cancer) dataset (AMEERA 1-2) to predict pharmacodynamics in the later phase (AMEERA 3). Empirical investigations confirmed the efficacy of the DN-ODE, surpassing alternative PK/PD methodologies. Notably, we also introduce visualizations for each patient, demonstrating that the DN-ODE recognizes diverse tumor growth patterns (responded, progressed, and stable). Therefore, the DN-ODE model offers a promising tool for researchers and clinicians, enabling a comprehensive assessment of drug efficacy, identification of potential
Jinlin Xiang · Bozhao Qi · Qi Tang · Marc Cerou · Wei Zhao
Orthogonal Polynomials Quadrature Algorithm: a functional analytic approach to inverse problems in deep learning (Poster)
We present the new Orthogonal Polynomials-Quadrature Algorithm (OPQA), a parallelizable algorithm that solves two common inverse problems in deep learning from a functional-analytic approach. First, it finds a smooth probability density function as an estimate of the posterior, which can act as a proxy for fast inference; second, it estimates the evidence, the likelihood that a particular set of observations can be obtained. Everything can be parallelized and completed in one pass. A core component of OPQA is a functional transform of the square root of the joint distribution into a special functional space of our construction. Through this transform, the evidence is equated with the squared $L^2$ norm of the transformed function. Hence, the evidence can be estimated by the sum of squares of the transform coefficients. To expedite the computation of the transform coefficients, OPQA proposes a new computational scheme leveraging Gauss-Hermite quadrature in higher dimensions. Not only does it avoid the potential high-variance problem associated with random sampling methods, it also enables one to speed up the computation by parallelization, and significantly reduces the complexity via a vector decomposition.
Lilian Wong
Advancing Graph Neural Networks Through Joint Time-Space Dynamics (Poster)
We introduce the GeneRAlized Fractional Time-space graph diffusion network (GRAFT), a framework combining temporal and spatial nonlocal operators on graphs to effectively capture long-range interactions across time and space. Leveraging time-fractional diffusion processes, GRAFT encompasses a system's full historical context, while the $d$-path Laplacian diffusion ensures extended spatial interactions based on shortest paths. Notably, GRAFT mitigates the over-squashing problem common in graph networks. Empirical results show its prowess on self-similar, tree-like data due to its fractal-conscious design with fractional time derivatives. We delve deeply into the mechanics of GRAFT, emphasizing its distinctive ability to encompass both time and space diffusion processes through a random-walk perspective.
Qiyu Kang · Yanan Zhao · Kai Zhao · Xuhao Li · Qinxu Ding · Wee Peng Tay · Sijie Wang
Two-Step Bayesian PINNs for Uncertainty Estimation (Poster)
We use a two-step procedure to train Bayesian neural networks that provide uncertainties over the solutions to differential equation (DE) systems provided by Physics-Informed Neural Networks (PINNs). We take advantage of available error bounds over PINNs to formulate a heteroscedastic variance that improves the uncertainty estimation. Furthermore, we solve forward problems and utilize the obtained uncertainties to improve parameter estimation in inverse problems in the fields of cosmology and fermentation.
Pablo Flores · Olga Graf · Pavlos Protopapas · Karim Pichara
ODE Solvers are also Wayfinders: Neural ODEs for Multi-Agent Path Planning (Poster)
Multi-agent path planning is a central challenge in areas such as robotics, autonomous vehicles, and swarm intelligence. Traditional discrete methods often struggle with real-time adaptability and computational efficiency, emphasizing the need for continuous, optimizable solutions. This paper introduces a novel approach that harnesses Neural Ordinary Differential Equations (Neural ODEs) for multi-agent path planning in a continuous-time framework. By parameterizing agent dynamics with neural networks within these ODEs, we enable end-to-end trajectory optimization. The inherent dynamics of ODEs facilitate collision avoidance. We demonstrate our method's effectiveness across both 2D and 3D scenarios, navigating multiple agents amidst obstacles, underscoring the potential of Neural ODEs to transform path planning.
Progyan Das · Dwip Dalal
Physics-Informed Neural Operators with Exact Differentiation on Arbitrary Geometries (Poster)
Neural operators can learn operators from data, for example, to solve partial differential equations (PDEs). In some cases, this data-driven approach is not sufficient, e.g., if the data is limited or only available at a resolution that does not permit resolving the underlying physics. The Physics-Informed Neural Operator (PINO) aims to solve this issue by adding the PDE residual as a loss to the Fourier Neural Operator (FNO). Several methods have been proposed to compute the derivatives appearing in the PDE, such as finite differences and Fourier differentiation. However, these methods are limited to regular grids and suffer from inaccuracies. In this work, we propose the first method capable of exact derivative computations for general functions on arbitrary geometries. We leverage the Geometry-Informed Neural Operator (GINO), a recently proposed graph-based extension of FNO. While GINO can be queried at arbitrary points in the output domain, it is not differentiable with respect to those points due to a discrete neighbor-search procedure. We introduce a fully differentiable extension of GINO that uses a differentiable weight function and neighbor caching in order to maintain the efficiency of GINO while allowing for exact derivatives. We empirically show that our method matches prior PINO methods while being the first to compute exact derivatives for arbitrary query points.
Colin White · Julius Berner · Jean Kossaifi · Mogab Elleithy · David Pitt · Daniel Leibovici · Zongyi Li · Kamyar Azizzadenesheli · Animashree Anandkumar
PINNs-Torch: Enhancing Speed and Usability of Physics-Informed Neural Networks with PyTorch (Poster)
Physics-informed neural networks (PINNs) stand out for their ability in supervised learning tasks that align with physical laws, especially nonlinear partial differential equations (PDEs). In this paper, we introduce "PINNs-Torch", a Python package that accelerates PINN implementation using the PyTorch framework and streamlines user interaction by abstracting away PDE setup. While we utilize PyTorch's dynamic computational graph for its flexibility, we mitigate its computational overhead in PINNs by compiling to static computational graphs. In our assessment across 8 diverse examples, covering continuous, discrete, forward, and inverse configurations, naive PyTorch is slower than TensorFlow; however, when integrated with CUDA Graphs and JIT compilation, training speeds can increase by up to 9 times relative to TensorFlow implementations. Additionally, through a real-world example, we highlight situations where our package might not deliver speed improvements. For community collaboration and future developments, our package code is accessible at: link.
Reza Akbarian Bafghi · Maziar Raissi
One-Shot Transfer Learning for Nonlinear ODEs (Poster)
We introduce a generalizable approach that combines the perturbation method and one-shot transfer learning to solve nonlinear ODEs with a single polynomial term, using Physics-Informed Neural Networks (PINNs). Our method transforms nonlinear ODEs into linear ODE systems, trains a PINN across varied conditions, and offers a closed-form solution for new instances within the same nonlinear ODE class. We demonstrate the effectiveness of this approach on the Duffing equation and suggest its applicability to similarly structured PDEs and ODE systems.
Wanzhou Lei · Pavlos Protopapas · Joy Parikh
Deep PDE Solvers for Subgrid Modelling and Out-of-Distribution Generalization (Poster)
Climate and weather modelling (CWM) is an important area where ML models are used for subgrid modelling: making predictions of processes occurring at scales too small to be resolved by standard solution methods (Brasseur & Jacob, 2017). These models are expected to make accurate predictions, even on out-of-distribution (OOD) data, and are additionally expected to respect important physical constraints of the ground-truth model (Kashinath et al., 2021). While many specialized ML PDE solvers have been developed, the particular requirements of CWM models have not been addressed so far. The goal of this work is to address them. We propose and develop a novel architecture, which matches or exceeds the performance of standard ML models, and which demonstrably succeeds in OOD generalization. The architecture is based on expert knowledge of the structure of PDE solution operators, which permits the model to also obey important physical constraints.
Patrick Chatain · Adam Oberman
Multiscale Neural Operators for Solving Time-Independent PDEs (Poster)
Time-independent Partial Differential Equations (PDEs) on large meshes pose significant challenges for data-driven neural PDE solvers. We introduce a novel graph rewiring technique to tackle some of these challenges, such as aggregating information across scales and on irregular meshes. Our proposed approach bridges distant nodes, enhancing the global interaction capabilities of GNNs. Our benchmarks on three datasets reveal that GNN-based methods set new performance standards for time-independent PDEs on irregular meshes. Finally, we show that our graph rewiring strategy boosts the performance of baseline methods, achieving state-of-the-art results on one of the tasks.
Winfried Ripken · Lisa Coiffard · Felix Pieper · Sebastian Dziadzio
Individualized Dosing Dynamics via Neural Eigen Decomposition (Poster)
Dosing models often use differential equations to model biological dynamics. Neural differential equations in particular can learn to predict the derivative of a process, which permits predictions at irregular points in time. However, this temporal flexibility often comes with a high sensitivity to noise, whereas medical problems often present high noise and limited data. Moreover, medical dosing models must generalize reliably over individual patients and changing treatment policies. To address these challenges, we introduce the Neural Eigen Stochastic Differential Equation algorithm (NESDE). NESDE provides individualized modeling (using patient-level parameters); generalization to new treatment policies (using decoupled control); tunable expressiveness according to the noise level (using piecewise linearity); and fast, continuous, closed-form prediction (using a spectral representation). We demonstrate the robustness of NESDE on real medical problems and use the learned dynamics to publish simulated medical gym environments.
Stav Belogolovsky · Ido Greenberg · Danny Eytan · Shie Mannor
Neural Differential Recurrent Neural Network with Adaptive Time Steps (Poster)
The neural Ordinary Differential Equation (ODE) model has shown success in learning continuous-time processes from observations at discrete time stamps. In this work, we consider the modeling and forecasting of time series data that are non-stationary and may have sharp changes like spikes. We propose an RNN-based model, called $\textit{RNN-ODE-Adap}$, that uses a neural ODE to represent the time development of the hidden states, and we adaptively select time steps based on the steepness of changes in the data over time so as to train the model more efficiently on "spike-like" time series. Theoretically, $\textit{RNN-ODE-Adap}$ provably yields a consistent estimate of the intensity function for Hawkes-type time series data. We also provide an approximation analysis of the RNN-ODE model showing the benefit of adaptive steps. The proposed model is demonstrated to achieve higher prediction accuracy with reduced computational cost on simulated dynamical-system data, point-process data, and a real electrocardiography dataset.
Yixuan Tan · Liyan Xie · Xiuyuan Cheng
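The adaptive step selection described above can be illustrated independently of the RNN: concentrate evaluation points where the observed series changes most steeply. The gradient-magnitude criterion below is an illustrative stand-in for the paper's actual rule:

```python
import numpy as np

def adaptive_steps(t, x, n_keep):
    """Subsample time stamps, keeping the points where |dx/dt| is
    largest while always retaining both endpoints."""
    slope = np.abs(np.gradient(x, t))
    keep = np.argsort(slope)[-n_keep:]         # steepest points
    return np.union1d(keep, [0, len(t) - 1])   # sorted, with endpoints

# A series that is flat except for a sharp spike near t = 5.
t = np.linspace(0.0, 10.0, 1000)
x = np.exp(-((t - 5.0) ** 2) / 0.01)
idx = adaptive_steps(t, x, n_keep=50)
frac_near_spike = np.mean(np.abs(t[idx] - 5.0) < 0.5)
print(frac_near_spike)  # most kept points cluster around the spike
```

Placing solver steps where the signal is steep is what lets the model resolve spikes without paying for fine steps on the flat segments.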
Evaluating Uncertainty Quantification approaches for Neural PDEs in scientific applications (Poster)
The accessibility of spatially distributed data, enabled by affordable sensors and field and numerical experiments, has facilitated the development of data-driven solutions for scientific problems, including climate change, weather prediction, and urban planning. Neural Partial Differential Equations (Neural PDEs), which combine deep learning (DL) techniques with domain expertise (e.g., governing equations) for parameterization, have proven effective in capturing valuable correlations within spatiotemporal datasets. However, sparse and noisy measurements coupled with modeling approximations introduce aleatoric and epistemic uncertainties. Therefore, quantifying uncertainties propagated from model inputs to outputs remains a challenge and an essential goal for establishing the trustworthiness of Neural PDEs. This work evaluates various Uncertainty Quantification (UQ) approaches for both forward and inverse problems in scientific applications. Specifically, we investigate the effectiveness of Bayesian methods, such as Hamiltonian Monte Carlo (HMC) and Monte Carlo Dropout (MCD), and a more conventional approach, Deep Ensembles (DE). To illustrate their performance, we consider two canonical PDEs: Burgers' equation and the Navier-Stokes equation. Our results indicate that Neural PDEs can effectively reconstruct flow systems and predict the associated unknown parameters. However, the results derived from the Bayesian methods tend to display a higher degree of certainty in their predictions than those obtained using DE. This elevated certainty suggests that the Bayesian techniques might underestimate the true underlying uncertainty, thereby appearing more confident in their predictions than the DE approach.
Vardhan Dongre · Gurpreet Singh Hora
Multimodal base distributions for continuous-time normalising flows (Poster)
We investigate the utility of a multimodal base distribution in continuous-time normalising flows. Multimodality is incorporated through a Gaussian mixture model (GMM) centred at the empirical means of a target distribution's modes. In- and out-of-distribution likelihoods are reported for flows trained with a unimodal and a multimodal base distribution. Our results show that the GMM base distribution leads to performance that is comparable to a standard (unimodal) Gaussian distribution for in-distribution likelihoods, but provides the ability to sample from a specific mode in the target distribution, yields generated samples of improved quality, and gives more reliable out-of-distribution likelihoods for low-dimensional input spaces. We conclude that a GMM base distribution is an attractive alternative to the standard base, whose inclusion incurs little to no cost and whose parameterisation may assist with more reliable out-of-distribution likelihoods.
Shane Josias · Willie Brink
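Sampling from such a multimodal base is straightforward to sketch: pick a mixture component, then add Gaussian noise around that mode's empirical mean. The means and scale below are toy values for illustration, not taken from the paper:

```python
import numpy as np

def sample_gmm_base(mode_means, scale, n, rng):
    """Sample the multimodal base: pick a mixture component uniformly,
    then add isotropic Gaussian noise around that mode's mean (the
    empirical mode means of the target, per the abstract)."""
    means = np.asarray(mode_means)
    comp = rng.integers(len(means), size=n)
    z = means[comp] + scale * rng.standard_normal((n, means.shape[1]))
    return z, comp

rng = np.random.default_rng(0)
modes = [[-4.0, 0.0], [4.0, 0.0]]   # toy mode means
z, comp = sample_gmm_base(modes, scale=0.5, n=1000, rng=rng)
# Restricting to one component targets a single mode of the flow's output.
z_left = z[comp == 0]
print(z_left[:, 0].mean())  # near -4
```

Since the flow is a continuous deformation, base samples drawn around one mode's mean tend to land in the corresponding target mode, which is what enables mode-specific sampling.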
Unifying Neural Controlled Differential Equations and Neural Flow for Irregular Time Series Classification (Poster)
Real-world time series data frequently exhibit irregular sampling intervals and may contain missing values, posing challenges for effective analysis and modeling. To handle these complexities, we present an approach that synergistically combines Neural Controlled Differential Equations (Neural CDEs) with Neural Flows. Central to our methodology is the introduction of a dual latent space, designed to discern and stabilize latent values amidst the irregularities intrinsic to the sampled time series data. Our empirical investigations span 18 datasets across three distinct domains, tested under four different missing-rate scenarios. The findings consistently underscore the superiority of our proposed model over existing benchmarks in the classification of irregularly sampled time series data. Such robust performance accentuates our model's versatility, making it a promising candidate for real-world applications.
YongKyung Oh · Dongyoung Lim · SUNGIL KIM
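A minimal sketch of the Neural CDE mechanism underlying this line of work: the latent state evolves as dz = f(z) dX, where X is a continuous path interpolating the (possibly irregular) observations. Here f is a hypothetical fixed vector field and the integration is a crude Euler step over a piecewise-linear path, standing in for a learned network and an adaptive solver:

```python
import numpy as np

t_obs = np.array([0.0, 0.3, 1.1, 1.5, 3.0])   # irregular sample times
x_obs = np.sin(t_obs)                          # scalar observations

A = np.array([[-0.5], [0.3]])                  # hypothetical "learned" weights

def f(z):
    # toy vector field: maps the 2-d latent z to a (2 x 1) matrix so that
    # f(z) dX yields a 2-d latent increment for a 1-d control path X
    return A * (1.0 + 0.1 * np.tanh(z))[:, None]

# Euler discretisation of dz = f(z) dX along a piecewise-linear path X
z = np.zeros(2)
for k in range(len(t_obs) - 1):
    dX = x_obs[k + 1] - x_obs[k]               # increment of the control path
    z = z + f(z) @ np.array([dX])
```

Because the dynamics are driven by path increments rather than fixed-step inputs, irregular gaps between observations are handled naturally; a classifier would then read off the final latent z.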
Enhanced Distribution Modelling via Augmented Architectures For Neural ODE Flows (Poster)
While the neural ODE formulation of normalizing flows such as FFJORD enables us to calculate the determinants of free-form Jacobians in $\mathcal{O}(D)$ time, the flexibility of the transformation underlying neural ODEs has been shown to be suboptimal. In this paper, we present AFFJORD, a neural ODE-based normalizing flow which enhances the representation power of FFJORD by defining the neural ODE through special augmented transformation dynamics which preserve the topology of the space. Furthermore, we derive the Jacobian determinant of the general augmented form by generalizing the chain rule in the continuous sense into the $\textit{cable rule}$, which expresses the forward sensitivity of ODEs with respect to their initial conditions. The cable rule gives an explicit expression for the Jacobian of a neural ODE transformation, and provides an elegant proof of the instantaneous change of variables. Our experimental results on density estimation on synthetic and high-dimensional data, such as MNIST, CIFAR-10 and CelebA ($32\times32$), show that AFFJORD outperforms the baseline FFJORD through the improved flexibility of the underlying vector field.
Etrit Haxholli · Marco Lorenzi
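The $\mathcal{O}(D)$ determinant cost mentioned above comes, in FFJORD-style flows, from estimating the divergence $\mathrm{tr}(\partial f/\partial z)$ with the Hutchinson trace estimator, $\mathbb{E}[\varepsilon^\top J \varepsilon] = \mathrm{tr}(J)$. A self-contained sketch with a linear toy vector field, whose Jacobian is known exactly (in the general case, autodiff would supply the vector-Jacobian product $\varepsilon^\top J$):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10

# Toy linear "vector field" f(z) = A z, so the true divergence is tr(A)
A = rng.standard_normal((D, D))
f = lambda z: A @ z

true_div = np.trace(A)

# Hutchinson estimator: E[eps^T J eps] = tr(J) when E[eps eps^T] = I.
# For this linear f, eps^T J eps = eps^T A eps exactly.
n_samples = 200_000
eps = rng.standard_normal((n_samples, D))
est = np.einsum('ni,ij,nj->n', eps, A, eps).mean()
```

AFFJORD's cable rule instead gives an explicit expression for the full Jacobian of the augmented transformation; the estimator above is the baseline machinery that makes the unaugmented log-density computation tractable.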
Does In-Context Operator Learning Generalize to Domain-Shifted Settings? (Poster)
Neural network-based approaches for learning differential equations (DEs) have demonstrated generalization capabilities within a DE solution or operator instance. However, because standard techniques can only represent the solution function or operator for a single system at a time, the broader notion of generalization across classes of DEs has so far gone unexplored. In this work, we investigate whether commonalities across DE classes can be leveraged to transfer knowledge about solving one DE towards solving another --- without updating any model parameters. To this end, we leverage the recently proposed in-context operator learning (ICOL) framework, which trains a model to identify in-distribution operators given a small number of input-output pairs as examples. Our implementation is motivated by pseudospectral methods, a class of numerical solvers that can be systematically applied to a range of DEs. For a natural distribution of 1D linear ordinary differential equations (ODEs), we identify a connection between operator learning and in-context linear regression. Applying recent results demonstrating the capabilities of Transformers to in-context learn linear functions, our reduction to least squares helps to explain why Transformers can be expected to solve ODEs in-context. Empirically, we demonstrate that ICOL is robust to a range of distribution shifts, including observational noise, domain-shifted inputs, varying boundary conditions, and surprisingly, even operators from functional forms unseen during training.
Jerry Liu · N. Benjamin Erichson · Kush Bhatia · Michael Mahoney · Christopher Ré
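The reduction to in-context linear regression can be illustrated on a toy linear operator acting on truncated coefficient vectors (as in a pseudospectral discretisation). Everything below, including the operator size and prompt length, is an illustrative assumption; the point is that the operator can be identified from the prompt alone by least squares, with no parameter updates:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8   # number of basis coefficients (pseudospectral truncation)

# Unknown linear operator on coefficient vectors (e.g. a discretised ODE
# solution operator). The in-context "prompt" is a few input/output pairs.
W_true = rng.standard_normal((K, K))
n_demo = 32
U = rng.standard_normal((n_demo, K))   # demonstration inputs
V = U @ W_true.T                       # corresponding outputs

# Least-squares identification from the prompt, mirroring what an
# in-context learner is hypothesised to implement internally
X, *_ = np.linalg.lstsq(U, V, rcond=None)
W_hat = X.T

u_query = rng.standard_normal(K)
v_pred = W_hat @ u_query
v_true = W_true @ u_query
```

In the noiseless, overdetermined case the recovery is exact, which is why Transformers known to perform in-context linear regression can be expected to solve such ODE operator tasks in-context.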
On the Generalization of Deep Neural Networks for Optimal Sensor Placement in Global Ocean Forecasting (Poster)
The focus of this study is on the generalization of neural networks, particularly in the context of sensor placement for global climate models' forecasts. The goal is to determine if sensor placement strategies derived through training a deep learning model, which is tasked with reconstructing a physical field from a set of measurements, can be effectively applied to a real high-resolution ocean global circulation model. The research compares different sensor placement methods, including one achieved using the Concrete Autoencoder method. Through modeling under varied initial conditions of the World Ocean state, it was found that sensor placements informed by deep learning methods outperformed others in forecast accuracy when using a comparable number of sensors. This finding underscores the potential of deep learning-informed sensor placement as a powerful tool for refining the predictive capabilities of global climate models and accelerating the data assimilation system without extensive revisions to their source code.
Alexander Lobashev · Nikita Turko · Konstantin Ushakov · Maxim Kaurkin · Rashit Ibrayev
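A common baseline behind such sensor-placement comparisons (not necessarily the authors' pipeline) reconstructs a field from sparse measurements by least squares in a low-rank modal basis; the sketch below uses a random synthetic basis and random sensor locations as stand-ins for POD modes of ocean-state snapshots and a learned placement:

```python
import numpy as np

rng = np.random.default_rng(0)
n_grid, r = 100, 5

# Low-rank orthonormal basis standing in for dominant modes of
# ocean-state snapshots (in practice obtained from data, e.g. via POD/PCA)
modes = np.linalg.qr(rng.standard_normal((n_grid, r)))[0]

# Sensor locations: random here; the paper compares learned placements
# (e.g. via a Concrete Autoencoder) against such baselines
sensors = rng.choice(n_grid, size=10, replace=False)

# A field in the span of the basis, observed only at the sensors
field = modes @ rng.standard_normal(r)
y = field[sensors]

# Least-squares reconstruction from sparse measurements:
# y = modes[sensors] @ a  =>  solve for the coefficients a
a, *_ = np.linalg.lstsq(modes[sensors], y, rcond=None)
recon = modes @ a
```

With more sensors than retained modes and a well-conditioned measurement matrix, the reconstruction is exact; the quality of a placement is essentially the conditioning of `modes[sensors]`, which is what learned placements aim to improve.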
A Holistic Vision: Modeling Patient Trajectories in Longitudinal Medical Imaging (Poster)
In medical data analysis, human practitioners have long excelled at adopting a holistic approach, considering a wide range of patient information, including multiple imaging sources and evolving medical histories. A prime example can be found in tumor boards, where, among others, radiologists evaluate a multitude of images while taking into account dynamic patient narratives. Within the domain of medical image analysis, however, the current focus often narrows to individual images, or, if longitudinal data is used, to non-dense predictions such as classification. If we can instead leverage multiple time points and model a patient trajectory, we can predict patient status at the image level at an arbitrary future time point. This could not only lead to better predictions for the current time point, e.g. for image segmentation, but also to a more holistic approach to medical image analysis, more akin to the human one. In response to this disparity, we motivate the need for longitudinal medical image analysis and present a model that handles sparse and irregular longitudinal series and, without sacrificing generality, generates images.
Nico Disch · David Zimmerer
Solving Noisy Inverse Problems via Posterior Sampling: A Policy Gradient View-Point (Poster)
Solving image inverse problems (e.g., super-resolution and inpainting) requires generating a high-fidelity image that matches the given input (the low-resolution image or the masked image). By using the input image as guidance, we can leverage a pretrained diffusion generation model to solve a wide range of image inversion tasks without task-specific model fine-tuning. In this work, we propose diffusion policy gradient (DPG), a tractable computation method to estimate the score function given the guidance image. Our method is robust to both Gaussian and Poisson noise added to the input image, and it improves image restoration consistency and quality on the FFHQ, ImageNet and LSUN datasets on both linear and non-linear image inversion tasks (inpainting, super-resolution, motion deblur, non-linear deblur, etc.).
Haoyue Tang · Tian Xie · Aosong Feng · Hanyu Wang · Chenyang Zhang · Yang Bai
Neural oscillators for generalizing parametric PDEs (Poster)
Parametric partial differential equations (PDEs) are ubiquitous in various scientific and engineering fields, manifesting the behavior of systems under varying parameters. Predicting solutions over a parametric space is desirable but prohibitively costly and challenging. In addition, recent neural PDE solvers are usually limited to interpolation scenarios, where solutions are predicted for inputs within the support of the training set. This work proposes to utilize neural oscillators to extend predictions to parameters beyond the trained regime, effectively extrapolating in the parametric space. The proposed methodology is validated on three parametric PDEs: linear advection, viscous Burgers, and nonlinear heat. The results underscore the promising potential of neural oscillators in extrapolation scenarios for both linear and nonlinear parametric PDEs.
Taniya Kapoor · Abhishek Chandra · Daniel Tartakovsky · Hongrui Wang · Alfredo Nunez · Rolf Dollevoet