Workshop
Machine Learning and the Physical Sciences
Atilim Gunes Baydin · Adji Bousso Dieng · Emine Kucukbenli · Gilles Louppe · Siddharth Mishra-Sharma · Benjamin Nachman · Brian Nord · Savannah Thais · Anima Anandkumar · Kyle Cranmer · Lenka Zdeborová · Rianne van den Berg

Sat Dec 03 05:50 AM -- 03:00 PM (PST) @ Room 275 - 277

The Machine Learning and the Physical Sciences workshop aims to provide an informal, inclusive and leading-edge venue for research and discussions at the interface of machine learning (ML) and the physical sciences. This interface spans (1) applications of ML in physical sciences (ML for physics), (2) developments in ML motivated by physical insights (physics for ML), and most recently (3) convergence of ML and physical sciences (physics with ML), which inspires questioning what scientific understanding means in the age of science powered by complex AI, and what roles machine and human scientists will play in developing scientific understanding in the future.

Sat 5:50 a.m. - 6:00 a.m. Opening remarks (Introduction to the Workshop)
Sat 6:00 a.m. - 6:30 a.m. Invited talk: David Pfau (Invited talk) David Pfau · Siddharth Mishra-Sharma
Sat 6:30 a.m. - 6:45 a.m. Contributed talk: Kieran Murphy -- Characterizing information loss in a chaotic double pendulum with the Information Bottleneck (Contributed talk) Kieran Murphy · Siddharth Mishra-Sharma
Sat 6:45 a.m. - 7:15 a.m. Invited talk: Hiranya Peiris (Invited talk) Hiranya Peiris · Siddharth Mishra-Sharma
Sat 7:15 a.m. - 7:30 a.m. Contributed talk: Marco Aversa -- Physical Data Models in Machine Learning Imaging Pipelines (Contributed talk) Marco Aversa · Siddharth Mishra-Sharma
Sat 7:30 a.m. - 8:00 a.m. Invited talk: Giorgio Parisi (Invited talk)
Sat 8:00 a.m. - 9:00 a.m. Poster session and break
Sat 9:00 a.m. - 10:00 a.m. Panel: Kathleen Creel, Mario Krenn, and Emily Sullivan (Panel) Philosophy of Science in the AI Era
Sat 10:00 a.m. - 11:15 a.m. Lunch
Sat 11:15 a.m. - 11:45 a.m. Invited talk: E. Doğuş Çubuk (Invited talk) Ekin Dogus Cubuk · Siddharth Mishra-Sharma
Sat 11:45 a.m. - 12:15 p.m. Invited talk: Vinicius Mikuni (Invited talk) Vinicius Mikuni · Siddharth Mishra-Sharma
Sat 12:15 p.m. - 12:30 p.m. Contributed talk: Aurélien Dersy -- Simplifying Polylogarithms with Machine Learning (Contributed talk) Aurelien Dersy · Siddharth Mishra-Sharma
Sat 12:30 p.m. - 1:00 p.m. Invited talk: Federico Felici (Invited talk) Federico Felici · Siddharth Mishra-Sharma
Sat 1:00 p.m. - 1:15 p.m. Contributed talk: Alexandre Adam -- Posterior samples of source galaxies in strong gravitational lenses with score-based priors (Contributed talk) Alexandre Adam · Siddharth Mishra-Sharma
Sat 1:15 p.m. - 1:30 p.m. Break
Sat 1:30 p.m. - 2:00 p.m. Invited talk: Catherine Nakalembe and Hannah Kerner (Invited talk) Catherine Nakalembe · Hannah Kerner · Siddharth Mishra-Sharma
Sat 2:00 p.m. - 2:05 p.m. Closing remarks
Sat 2:05 p.m. - 3:00 p.m.
Poster session 🔗 - Leveraging the Stochastic Predictions of Bayesian Neural Networks for Fluid Simulations (Poster) []  We investigate uncertainty estimation and multimodality via the non-deterministic predictions of Bayesian neural networks (BNNs) in fluid simulations. To this end, we deploy BNNs in two challenging experimental test cases. We show that BNNs, when used as surrogate models for steady-state fluid flow predictions, provide accurate physical predictions together with sensible estimates of uncertainty. In our main experiment, we study BNNs in the context of differentiable solver interactions with turbulent plasma flows. We find that BNN-based corrector networks can stabilize coarse-grained simulations and successfully create diverse trajectories. Maximilian Mueller · Robin Greif · Frank Jenko · Nils Thuerey 🔗 - Discovering Long-period Exoplanets using Deep Learning with Citizen Science Labels (Poster) []  Automated planetary transit detection has become vital to prioritize candidates for expert analysis given the scale of modern telescopic surveys. While current methods for short-period exoplanet detection work effectively due to periodicity in the light curves, a robust approach for detecting single-transit events is lacking. However, volunteer-labelled transits recently collected by the Planet Hunters TESS (PHT) project now provide an unprecedented opportunity to investigate a data-driven approach to long-period exoplanet detection. In this work, we train a 1-D convolutional neural network to classify planetary transits using PHT volunteer scores as training data. We find that using volunteer scores significantly improves performance over synthetic data, and enables the recovery of known planets at a precision and rate matching that of the volunteers. Importantly, the model also recovers transits found by volunteers but missed by current automated methods. 
Shreshth A Malik · Nora Eisner · Chris Lintott · Yarin Gal 🔗 - HIGlow: Conditional Normalizing Flows for High-Fidelity HI Map Modeling (Poster) []  Extracting the maximum amount of cosmological and astrophysical information from upcoming large-scale surveys remains a challenge. This includes evaluating the exact likelihood, parameter inference and generating new diverse synthetic examples of the incoming high-dimensional data sets. In this work, we propose the use of normalizing flows as a generative model of the neutral hydrogen (HI) maps from the CAMELS project. Normalizing flows have been very successful at parameter inference and generating new, realistic examples. Our model utilizes the spatial structure of the HI maps in order to faithfully follow the statistics of the data, allowing for high-fidelity sample generation and efficient parameter inference. Roy Friedman · Sultan Hassan 🔗 - Virgo: Scalable Unsupervised Classification of Cosmological Shock Waves (Poster) []  Cosmological shock waves are essential to understanding the formation of cosmological structures. To study them, scientists run computationally expensive high-resolution 3D hydrodynamic simulations. Interpreting the simulation results is challenging because the resulting data sets are enormous, and the shock wave surfaces are hard to separate and classify due to their complex morphologies and multiple shock fronts intersecting. We introduce a novel pipeline, Virgo, combining physical motivation, scalability, and probabilistic robustness to tackle this unsolved unsupervised classification problem. To this end, we employ kernel principal component analysis with low-rank matrix approximations to denoise data sets of shocked particles and create labeled subsets. We perform supervised classification to recover full data resolution with stochastic variational deep kernel learning. We evaluate on three state-of-the-art data sets with varying complexity and achieve good results. 
The proposed pipeline runs automatically, has few hyperparameters, and performs well on all tested data sets. Our results are promising for large-scale applications, and we highlight the future scientific work they enable. Max Lamparth · Ludwig Böss · Ulrich Steinwandel · Klaus Dolag 🔗 - Learning Feynman Diagrams using Graph Neural Networks (Poster) []  In the wake of the growing popularity of machine learning in particle physics, this work finds a new application of geometric deep learning on Feynman diagrams to make accurate and fast matrix element predictions with the potential to be used in analysis of quantum field theory. This research uses the graph attention layer, which makes matrix element predictions to 1 significant figure accuracy above 90% of the time. Peak performance was achieved in making predictions to 3 significant figure accuracy over 10% of the time with less than 200 epochs of training, serving as a proof of concept on which future work can build for better performance. Finally, a procedure is suggested to use the network to make advancements in quantum field theory by constructing Feynman diagrams with effective particles that represent non-perturbative calculations. Alexander Norcliffe · Harrison Mitchell · Pietro Lió 🔗 - Certified data-driven physics-informed greedy auto-encoder simulator (Poster) []  A parametric adaptive greedy Latent Space Dynamics Identification (gLaSDI) framework is developed for accurate, efficient, and certified data-driven physics-informed greedy auto-encoder simulators of high-dimensional nonlinear dynamical systems. In the proposed framework, an auto-encoder and dynamics identification models are trained interactively to discover intrinsic and simple latent-space dynamics. 
To effectively explore the parameter space for optimal model performance, an adaptive greedy sampling algorithm integrated with a physics-informed error indicator is introduced to search for optimal training samples on the fly, outperforming the conventional predefined uniform sampling. Further, an efficient k-nearest neighbor convex interpolation scheme is employed to exploit local latent-space dynamics for improved predictability. Numerical results demonstrate that the proposed method achieves 121 to 2,658x speed-up with 1 to 5% relative errors for radial advection dynamical problems. Xiaolong He · Youngsoo Choi · William Fries · Jon Belof · Jiun-Shyan Chen 🔗 - Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian Inference (Poster) []  Although the no-u-turn sampler (NUTS) is a widely adopted method for performing Bayesian inference, it requires numerous posterior gradients which can be expensive to compute in practice. Recently, there has been a significant interest in physics-based machine learning of dynamical (or Hamiltonian) systems and Hamiltonian neural networks (HNNs) is a noteworthy architecture. But these types of architectures have not been applied to solve Bayesian inference problems efficiently. We propose the use of HNNs for performing Bayesian inference efficiently without requiring numerous posterior gradients. We introduce latent variable outputs to HNNs (L-HNNs) for improved expressivity and reduced integration errors. We integrate L-HNNs in NUTS and further propose an online error monitoring scheme to prevent sampling degeneracy in regions where L-HNNs may have little training data. We demonstrate L-HNNs in NUTS with online error monitoring considering several complex high-dimensional posterior densities and compare its performance to NUTS. 
Som Dhulipala · Yifeng Che · Michael Shields 🔗 - Offline Model-Based Reinforcement Learning for Tokamak Control (Poster) []  Unlocking the potential of nuclear fusion as an energy source would have profound impacts on the world. Nuclear fusion is an attractive energy source since the fuel is abundant, there is no risk of meltdown, and there are no high-level radioactive byproducts \citep{walker2020introduction}. Perhaps the most promising technology for harnessing nuclear fusion as a power source is the tokamak: a device that relies on magnetic fields to confine a torus-shaped plasma. While strides are being made to prove that net energy output is possible with tokamaks \citep{meade200950}, there are still crucial control challenges that exist with these devices \citep{humphreys2015novel}. In this work, we focus on learning controls via offline model-based reinforcement learning for DIII-D, a device operated by General Atomics in San Diego, California. This device has been in operation since 1986, during which there have been over one hundred thousand "shots" (runs of the device). We use approximately 15k shots to learn a dynamics model that can predict the evolution of the plasma subject to different actuator settings. This dynamics model can then be used as a simulator that generates experience for the reinforcement learning algorithm to train on. We apply this method to train a controller that uses DIII-D's eight neutral beams to achieve desired $\beta_N$ (the normalized ratio between plasma pressure and magnetic pressure) and differential rotation targets. This controller was then evaluated on the DIII-D device. This work marks one of the first efforts for doing feedback control on a tokamak via a reinforcement learning agent that was trained on historical data alone. 
Ian Char · Joseph Abbate · Laszlo Bardoczi · Mark Boyer · Youngseog Chung · Rory Conlin · Keith Erickson · Viraj Mehta · Nathan Richner · Egemen Kolemen · Jeff Schneider 🔗 - Decay-aware neural network for event classification in collider physics (Poster) []  The goal of event classification in collider physics is to distinguish signal events of interest from background events to the extent possible to search for new phenomena in nature. We propose a decay-aware neural network based on a multi-task learning technique to effectively address this event classification. The proposed model is designed to learn the domain knowledge of particle decays as an auxiliary task, which is a novel approach to improving learning efficiency in the event classification. Our experiments using simulation data confirmed that an inductive bias was successfully introduced by adding the auxiliary task, and significant improvements in the event classification were achieved compared with boosted decision tree and simple multi-layer perceptron models. Tomoe Kishimoto · Masahiro Morinaga · Masahiko Saito · Junichi Tanaka 🔗 - Phase transitions and structure formation in learning local rules (Poster) []  We study a teacher-student rule learning scenario, where the teacher is determined by a local rule and the student model is a uniform tensor-network attention model. The student model also implements a map from variable-size binary inputs to the latent space $\mathcal{V}=\mathds{R}^d$, where $d$ is the bond dimension of the student model. Using gradient descent learning we find a second-order phase transition in the test error. At the transition we observe a sudden drop in the effective dimension of the mapped training data. We also find that a small effective dimension corresponds to structure formation in the latent space $\mathcal{V}$. Bojan Žunkovič · Enej Ilievski 🔗 - Lyapunov Regularized Forecaster (Poster) []  Turbulent flow prediction plays a crucial role in climate change prediction. 
In particular, long-term prediction of turbulent flow is a primary and promising goal for the field and is attracting growing attention from researchers. However, because the Navier-Stokes equations that govern turbulent flow are chaotic, imperceptible initial differences can lead to large differences in future states, making long-term prediction extremely difficult, even for the state-of-the-art turbulence prediction model Turbulent-Flow Net (TF-Net), which introduces a trainable bispectral decomposition and combines the temporal properties of turbulence with spatial modeling. Realizing that error propagation leads to severe instability over time in long-term prediction, we propose adding a time-based Lyapunov regularizer to the loss function of TF-Net to avoid training error propagation and improve the trained long-term prediction. The comparison experiment shows that our Lyapunov-regularized forecaster does have more stable long-term predictions. Rong Zheng · Rose Yu 🔗 - Ad-hoc Pulse Shape Simulation using Cyclic Positional U-Net (Poster) []  High-Purity Germanium (HPGe) detectors have been a key technology for rare-event searches, such as neutrinoless double-beta decay and dark matter searches, for many decades. Pulse shape simulation is pivotal to improving the physics reach of these experiments. In this work, we propose a Cyclic Positional U-Net (CPU-Net) to achieve ad-hoc pulse shape simulations with high precision and low latency. Taking the transfer learning approach, CPU-Net translates simulated pulses to detector pulses such that they are indistinguishable. We demonstrate CPU-Net's performance on data taken from a local HPGe detector. 
Aobo Li 🔗 - Learning Uncertainties the Frequentist Way: Calibration and Correlation in High Energy Physics (Poster) []  In this paper, we present a machine learning framework for performing frequentist maximum likelihood inference with Gaussian uncertainty estimation, which also quantifies the mutual information between the unobservable and measured quantities. This framework uses the Donsker-Varadhan representation of the Kullback-Leibler divergence---parametrized with a novel GaussianAnsatz---to enable a simultaneous extraction of the maximum likelihood values, uncertainties, and mutual information in a single training. We demonstrate our framework by extracting jet energy corrections and resolution factors from a simulation of the CMS detector at the Large Hadron Collider. By leveraging the high-dimensional feature space inside jets, we improve upon the nominal CMS jet resolution by upward of 15%. Rikab Gambhir · Jesse Thaler · Benjamin Nachman 🔗 - Molecular Fingerprints for Robust and Efficient ML-Driven Molecular Generation (Poster) []  We propose a novel molecular fingerprint-based variational autoencoder applied for molecular generation on real-world drug molecules. We define more suitable and pharma-relevant baseline metrics and tests, focusing on the generation of diverse, drug-like, novel small molecules and scaffolds. When we apply these molecular generation metrics to our novel model, we observe a substantial improvement in chemical synthetic accessibility (∆SAS = -0.83) and in computational efficiency up to 5.9x in comparison to an existing state-of-the-art SMILES-based architecture. Ruslan Tazhigulov · Joshua Schiller · Jacob Oppenheim · Max Winston 🔗 - Machine Learning for Chemical Reactions: A Dance of Datasets and Models (Poster) []  Machine Learning (ML) models have proved to be excellent emulators of Density Functional Theory (DFT) calculations for predicting features of small molecular systems. 
The activation energy is a defining feature of a chemical reaction, but despite the success of ML in computational chemistry, an accurate, fast, and general ML-calculator for Minimal Energy Paths (MEPs) has not yet been proposed. Here, we summarize contributions from two of our recent papers, where we apply Graph Neural Network (GNN) based models, trained on various datasets, as potentials for the Nudged Elastic Band (NEB) algorithm to speed up MEP-search. We show that relevant data from reactive regions of the Potential Energy Surface (PES) in training data is paramount to success. Hitherto popular benchmark datasets primarily contain configurations in, or close to, equilibrium, and are not adequate for the task. We propose a new dataset, Transition1x, that contains force and energy calculations for 10 million molecular configurations from on and around MEPs of 10,000 organic reactions of various types. By training GNNs on Transition1x and applying the models as PES-evaluators for NEB, we achieve a Mean Absolute Error (MAE) of 0.13 eV on predicted activation energies of unseen reactions, compared to DFT, while running the algorithm 1700 times faster. Transition1x is a challenging dataset containing a new type of data that may serve as a benchmark for future methods for transition-state search. Mathias Schreiner · Arghya Bhowmik · Tejs Vegge · Jonas Busk · Peter Bjørn Jørgensen · Ole Winther 🔗 - ML4LM: Machine Learning for Safely Landing on Mars (Poster) []  Human missions to Mars will require rocket-powered descent through the Martian atmosphere to safely land. Designing the propulsion system for these missions introduces the following challenges: ensuring safety, sparsity of data to validate models, and requirements for rapid simulations. ML offers opportunities for addressing these challenges. We use ML methods to develop novel data-analytic tools that support design analysis for enabling supersonic retropropulsion (SRP) technology deployment. 
Accordingly, we propose a hierarchical physics-embedded data-driven (HPDD) framework for predicting the key target quantity in SRP. The HPDD model is trained on small-scale wind tunnel data, and the model exhibits promising accuracy and computational efficiency. Wind tunnel testing in the future will provide more data for validation and enhancement of our framework to further the understanding of SRP. David Wu · Wai Tong Chung · Matthias Ihme 🔗 - Flexible learning of quantum states with generative query neural networks (Poster) Deep neural networks are a powerful tool for the characterization of quantum states. Existing networks are typically trained with experimental data gathered from the specific quantum state that needs to be characterized. But is it possible to train a neural network offline and to make predictions about quantum states other than the ones used for the training? Here we introduce a network model that can be trained with classically simulated data from a fiducial set of states and measurements, and can later be used to characterize quantum states that share structural similarities with the states in the fiducial set. With little guidance of quantum physics, the network builds its own data-driven representation of quantum states, and then uses it to predict the outcome statistics of quantum measurements that have not been performed yet. The state representation produced by the network can also be used for tasks beyond the prediction of outcome statistics, including clustering of quantum states and identification of different phases of matter. Our network model provides a flexible approach that can be applied to online learning scenarios, where predictions must be generated as soon as experimental data become available, and to blind learning scenarios where the learner has only access to an encrypted description of the quantum hardware. 
Yan Zhu · Ya-Dong Wu · Ge Bai · Dong-Sheng Wang · Yuexuan Wang · Giulio Chiribella 🔗 - Transfer Learning with Physics-Informed Neural Networks for Efficient Simulation of Branched Flows (Poster) []  Physics-Informed Neural Networks (PINNs) offer a promising approach to solving differential equations and, more generally, to applying deep learning to problems in the physical sciences. We adopt a recently developed transfer learning approach for PINNs and introduce a multi-head model to efficiently obtain accurate solutions to nonlinear systems of differential equations. In particular, we apply the method to simulate stochastic branched flows, a universal phenomenon in random wave dynamics. We compare the results achieved by feed forward and GAN-based PINNs on two physically relevant transfer learning tasks and show that our methods provide significant computational speedups in comparison to standard PINNs trained from scratch. Raphael Pellegrin · Blake Bullwinkel · Marios Mattheakis · Pavlos Protopapas 🔗 - Decorrelation with Conditional Normalizing Flows (Poster) []  The sensitivity of many physics analyses can be enhanced by constructing discriminants that preferentially select signal events. Such discriminants become much more useful if they are uncorrelated with a set of protected attributes. In this paper we show that a normalizing flow conditioned on the protected attributes can be used to find a decorrelated representation for any discriminant. As a normalizing flow is invertible, the separation power of the resulting discriminant will be unchanged at any fixed value of the protected attributes. We demonstrate the efficacy of our approach by building supervised jet taggers that produce almost no sculpting in the mass distribution of the background. Samuel Klein · Tobias Golling 🔗 - A New Task: Deriving Semantic Class Targets for the Physical Sciences (Poster) []  We define deriving semantic class targets as a novel multi-modal task. 
By doing so, we aim to improve classification schemes in the physical sciences which can be severely abstracted and obfuscating. We address this task for upcoming radio astronomy surveys and present the derived semantic radio galaxy morphology class targets. Micah Bowles 🔗 - Machine-learned climate model corrections from a global storm-resolving model (Poster) []  Due to computational constraints, running global climate models (GCMs) for many years requires a lower spatial grid resolution (>50 km) than is optimal for accurately resolving important physical processes. Such processes are approximated in GCMs via subgrid parameterizations, which contribute significantly to the uncertainty in GCM predictions. One approach to improving the accuracy of a coarse-grid global climate model is to add machine-learned state-dependent corrections at each simulation timestep, such that the climate model evolves more like a high-resolution global storm-resolving model (GSRM). We train neural networks to learn the state-dependent temperature, humidity, and radiative flux corrections needed to nudge a 200 km coarse-grid climate model to the evolution of a 3 km fine-grid GSRM. When these corrective ML models are coupled to a year-long coarse-grid climate simulation, the time-mean spatial pattern errors are reduced by 6-25% for land surface temperature and 9-25% for land surface precipitation with respect to a no-ML baseline simulation. The ML-corrected simulations develop other biases in climate and circulation that differ from, but have comparable amplitude to, the baseline simulation. Anna Kwa 🔗 - Amortized Bayesian Inference for Supernovae in the Era of the Vera Rubin Observatory Using Normalizing Flows (Poster) []  The Vera Rubin Observatory, set to begin observations in mid-2024, will increase our discovery rate of supernovae to well over one million annually. 
There has been a significant push to develop new methodologies to identify, classify and ultimately understand the millions of supernovae discovered with the Rubin Observatory. Here, we present the first simulation-based inference method using normalizing flows, trained to rapidly infer the parameters of a toy supernova model in multivariate, Rubin-like datastreams. We find that our method is well-calibrated compared to traditional inference methodologies (specifically MCMC), requiring only 1/10,000th of the CPU hours during test time. Victoria Villar 🔗 - Scalable Bayesian Inference for Finding Strong Gravitational Lenses (Poster) []  Finding strong gravitational lenses in astronomical images allows us to assess cosmological theories and understand the large-scale structure of the universe. Previous works on lens detection do not quantify uncertainties in lens parameter estimates or scale to modern surveys. We present a fully amortized Bayesian procedure for lens detection that overcomes these limitations. Unlike traditional variational inference, in which training minimizes the reverse Kullback-Leibler (KL) divergence, our method is trained with an expected forward KL divergence. Using synthetic GalSim images and real Sloan Digital Sky Survey (SDSS) images, we demonstrate that amortized inference trained with the forward KL produces well-calibrated uncertainties in both lens detection and parameter estimation. Yash Patel · Jeffrey Regier 🔗 - Training physical networks like neural networks: deep physical neural networks (Poster) []  Deep neural networks (DNNs) are increasingly used to predict physical processes. Here, we invert this relationship, and show that physical processes with adjustable physical parameters (e.g., geometry, voltages) can be trained to emulate DNNs, i.e., to perform machine learning inference tasks. We call these trainable processes physical neural networks (PNNs). 
We train experimental PNNs based on broadband optical pulses propagating in a nonlinear crystal, a nonlinear electronic oscillator, and an oscillating metal plate. As an extension of these laboratory proofs of concept, we train (in simulation) a network of coupled oscillators to perform Fashion MNIST classification. Since one cannot apply autodifferentiation directly to physical processes, we introduce a technique that uses a simulation model to efficiently estimate the gradients of the physical system, allowing us to use backpropagation to train PNNs. Using this technique, we train each system's physical transformations (which do not necessarily resemble typical DNN layers) directly to perform inference calculations. Our work may help inspire novel neural network architectures, including ones that can be efficiently realized with particular physical processes, and presents a route to training complex physical systems to take on desired physical functionalities, such as computational sensing. This article is intended as a summary of the previously published work [Wright, Onodera et al., 2022] for the NeurIPS 2022 Machine Learning and the Physical Sciences workshop. Logan Wright · Tatsuhiro Onodera · Martin M Stein · Tianyu Wang · Darren Schachter · Zoey Hu · Peter McMahon 🔗 - A Curriculum-Training-Based Strategy for Distributing Collocation Points during Physics-Informed Neural Network Training (Poster) []  Physics-informed Neural Networks (PINNs) often have, in their loss functions, terms based on physical equations and derivatives. In order to evaluate these terms, the output solution is sampled using a distribution of collocation points. However, density-based strategies, in which the number of collocation points over the domain increases throughout the training period, do not scale well to multiple spatial dimensions. To remedy this issue, we present here a curriculum-training-based method for lightweight collocation point distributions during network training. 
We apply this method to a PINN which recovers a full two-dimensional magnetohydrodynamic (MHD) solution from a partial sample taken from a baseline MHD simulation. We find that the curriculum collocation point strategy leads to a significant decrease in training time and simultaneously enhances the quality of the reconstructed solution. Marcus Münzer · Christopher Bard 🔗 - Learning latent variable evolution for the functional renormalization group (Poster) []  We perform a data-driven dimensionality reduction of the 4-point vertex function characterizing the functional Renormalization Group (fRG) flow for the widely studied two-dimensional t-t' Hubbard model on the square lattice. We show that a deep learning architecture based on Neural Ordinary Differential Equations efficiently learns the evolution of low-dimensional latent variables in all relevant magnetic and d-wave superconducting regimes of the Hubbard model. Ultimately, our work uses an encoder-decoder architecture to extract compact representations of the 4-point vertex functions for correlated electrons, a goal of utmost importance for the success of cutting-edge methods for tackling the many-electron problem. Matija Medvidović · Alessandro Toschi · Giorgio Sangiovanni · Cesare Franchini · Andy Millis · Anirvan Sengupta · Domenico Di Sante 🔗 - Deformations of Boltzmann Distributions (Poster) []  Consider a one-parameter family of Boltzmann distributions $p_t(x) = \tfrac{1}{Z_t}e^{-S_t(x)}$. This work studies the problem of sampling from $p_{t_0}$ by first sampling from $p_{t_1}$ and then applying a transformation $\Psi_{t_1}^{t_0}$ so that the transformed samples follow $p_{t_0}$. We derive an equation relating $\Psi$ and the corresponding family of unnormalized log-likelihoods $S_t$. 
The utility of this idea is demonstrated on the $\phi^4$ lattice field theory by extending its defining action $S_0$ to a family of actions $S_t$ and finding a $\tau$ such that normalizing flows perform better at learning the Boltzmann distribution $p_\tau$ than at learning $p_0$. Bálint Máté · François Fleuret 🔗 - Neuro-Symbolic Partial Differential Equation Solver (Poster) []  We present a highly scalable strategy for developing mesh-free neuro-symbolic partial differential equation solvers from existing numerical discretizations found in scientific computing. This strategy is unique in that it can be used to efficiently train neural network surrogate models for the solution functions and the differential operators, while retaining the accuracy and convergence properties of state-of-the-art numerical solvers. This neural bootstrapping method is based on minimizing residuals of discretized differential systems on a set of random collocation points with respect to the trainable parameters of the neural network, achieving unprecedented resolution and optimal scaling for solving physical and biological systems. Pouria Akbari Mistani · Samira Pakravan · Rajesh Ilango · Sanjay Choudhry · Frederic Gibou 🔗 - Generating astronomical spectra from photometry with conditional diffusion models (Poster) []  A trade-off between speed and information controls our understanding of astronomical objects. Fast-to-acquire photometric observations provide global properties, while costly and time-consuming spectroscopic measurements enable a better understanding of the physics governing their evolution. Here, we tackle this problem by generating spectra directly from photometry, through which we obtain an estimate of an object's intricacies from easily acquired images. This is achieved by using multimodal conditional diffusion models, where the best out of the generated spectra is selected with a contrastive network. 
Initial experiments on minimally processed SDSS galaxy data show promising results. Lars Doorenbos · Stefano Cavuoti · Giuseppe Longo · Massimo Brescia · Raphael Sznitman · Pablo Márquez Neila 🔗 - Identifying AGN host galaxies with convolutional neural networks (Poster) []  Active galactic nuclei (AGN) are supermassive black holes with luminous accretion disks found in some galaxies, and are thought to play an important role in galaxy evolution. However, traditional optical spectroscopy for identifying AGN requires time-intensive observations. We train a convolutional neural network (CNN) to distinguish AGN host galaxies from non-active galaxies using a sample of 210,000 Sloan Digital Sky Survey galaxies. We test the CNN on 33,000 galaxies that are spectrally classified as composites, and find correlations between galaxy appearances and their CNN classifications, which hint at evolutionary processes that affect both galaxy morphology and AGN activity. With the advent of the Vera C. Rubin Observatory, Nancy Grace Roman Space Telescope, and other wide-field imaging telescopes, deep learning methods will be instrumental for quickly and reliably shortlisting AGN samples for future analyses. Ziting Guo · John Wu · Chelsea Sharon 🔗 - Efficiently Moving Instead of Reweighting Collider Events with Machine Learning (Poster) []  There are many cases in collider physics and elsewhere where a calibration dataset is used to predict the known physics and / or noise of a target region of phase space. This calibration dataset usually cannot be used out-of-the-box but must be tweaked, often with conditional importance weights, to be maximally realistic. Using resonant anomaly detection as an example, we compare a number of alternative approaches based on transporting events with normalizing flows instead of reweighting them. 
We find that the accuracy of the morphed calibration dataset depends on the degree to which the transport task is set up to carry out optimal transport, which motivates future research into this area. Radha Mastandrea · Benjamin Nachman 🔗 - D-optimal neural exploration of nonlinear physical systems (Poster) []  Exploring an unknown physical environment in a sample-efficient and computationally fast manner is a challenging task. In this work, we introduce an exploration policy based on neural networks and experimental design. Our policy maximizes the one-step-ahead information gain on the model, which is computed using automatic differentiation, and leads us to an online exploration algorithm requiring small computing resources. We test our method on a number of nonlinear physical systems covering different settings. Matthieu Blanke · Marc Lelarge 🔗 - Machine learning for complete intersection Calabi-Yau manifolds (Poster) []  We describe the recent developments in using machine learning techniques to compute Hodge numbers of complete intersection Calabi-Yau (CICY) 3- and 4-folds. The main motivation is to understand how to study data from algebraic geometry and solve problems relevant for string theory with machine learning. We describe the state-of-the-art methods which reach near-perfect accuracy for several Hodge numbers, and discuss extrapolating from low to high Hodge numbers, and conversely. Harold Erbin · Mohamed Tamaazousti · Riccardo Finotello 🔗 - SuNeRF: Validation of a 3D Global Reconstruction of the Solar Corona Using Simulated EUV Images (Poster) []  Extreme Ultraviolet (EUV) light emitted by the Sun impacts satellite operations and communications and affects the habitability of planets. Currently, EUV-observing instruments are constrained to viewing the Sun from its equator (i.e., ecliptic), limiting our ability to forecast EUV emission for other viewpoints (e.g. 
solar poles), and to generalize our knowledge of the Sun-Earth system to other host stars. In this work, we adapt Neural Radiance Fields (NeRFs) to the physical properties of the Sun and demonstrate that non-ecliptic viewpoints could be reconstructed from observations limited to the solar ecliptic. To validate our approach, we train on simulations of solar EUV emission that provide a ground truth for all viewpoints. Our model accurately reconstructs the simulated 3D structure of the Sun, achieving a peak signal-to-noise ratio of 43.3 dB and a mean absolute relative error of 0.3% for non-ecliptic viewpoints. Our method provides a consistent 3D reconstruction of the Sun from a limited number of viewpoints, thus highlighting the potential to create a virtual instrument for satellite observations of the Sun. Its extension to real observations will provide the missing link to compare the Sun to other stars and to improve space-weather forecasting. Kyriaki-Margarita Bintsi · Robert Jarolim · Benoit Tremblay · Miraflor Santos · Anna Jungbluth · James Mason · Sairam Sundaresan · Angelos Vourlidas · Cooper Downs · Ronald Caplan · Andres Munoz-Jaramillo 🔗 - Generating Calorimeter Showers as Point Clouds (Poster) []  In particle physics, precise simulations are necessary to enable scientific progress. However, accurate simulations of the interaction processes in calorimeters are complex and computationally very expensive, demanding a large fraction of the available computing resources in particle physics at present. Various generative models have been proposed to reduce this computational cost. Usually, these models interpret calorimeter showers as 3D images in which each active cell of the detector is represented as a voxel. This approach becomes difficult for high-granularity calorimeters due to the larger sparsity of the data. In this study, we use this sparseness to our advantage and interpret the calorimeter showers as point clouds. 
More precisely, we consider each hit as part of a hit distribution depending on a global latent calorimeter shower distribution. A first model to learn calorimeter showers as point clouds is presented. The model is evaluated on a high-granularity calorimeter dataset. Simon Schnake · Dirk Krücker · Kerstin Borras 🔗 - Physics solutions for privacy leaks in machine learning (Poster) []  We show that tensor networks, widely used for providing efficient representations of quantum many-body systems and which have recently been proposed as machine learning architectures, have especially promising properties for privacy-preserving machine learning. First, we describe a new privacy vulnerability in feedforward neural networks, illustrating it in synthetic and real-world datasets. Then, we develop well-defined conditions to guarantee robustness to such vulnerability, and we rigorously prove that these conditions are satisfied by tensor networks. We supplement the analytical findings with practical examples where matrix product states are trained on datasets of medical records, showing large reductions in the probability of an attacker extracting information about the training dataset from the model's parameters when compared to feedforward neural networks. Alejandro Pozas-Kerstjens · Senaida Hernandez-Santana · José Ramón Pareja Monturiol · Marco Castrillon Lopez · Giannicola Scarpa · Carlos E. Gonzalez-Guillen · David Perez-Garcia 🔗 - From Particles to Fluids: Dimensionality Reduction for Non-Maxwellian Plasma Velocity Distributions Validated in the Fluid Context (Poster) []  Gases and plasmas can be modeled in both a statistical sense (as a collection of discrete particles) and a continuum sense (as a continuous distribution). A collection of discrete particles is often modeled using a Maxwellian velocity distribution, which is useful in many scenarios but limited by the assumption of thermal equilibrium. 
In this work, we develop an architecture to learn a low-dimensional, general parameterization of the velocity distribution from scientific instrument plasma data. Such parameterizations have direct applications in data compression and simplified downstream learning algorithms. We verify that this dimensionally-reduced distribution preserves the key underlying physics of the data after reconstruction, specifically looking at the fluid parameters as derived from the instrument plasma moments (e.g., density, velocity, temperature). Finally, we present evidence for an information bottleneck arising from the relationship between the number of reduced parameters and the quality of reconstructed fluid parameters. Applying this learned architecture to data compression, we achieved a 30X compression ratio with what were deemed acceptable losses. Daniel da Silva 🔗 - Simplifying Polylogarithms with Machine Learning (Poster) []  In particle physics, calculations are centered around Feynman integrals, which are commonly expressed using polylogarithmic functions such as the logarithm or the dilogarithm. Although the resulting expressions usually simplify with an astute application of polylogarithmic identities, it is often difficult to know which identities to apply and in what order. We explore the extent to which machine learning methods are able to help with this creative step. We implement two simplification strategies, one based on an intuitive application of reinforcement learning and one showcasing the potential of language models such as transformers. We demonstrate that the transformer approach is more flexible and holds promise for practical use in symbolic manipulation tasks relevant to mathematical physics. 
Aurelien Dersy · Matthew Schwartz · Xiaoyuan Zhang 🔗 - NLP Inspired Training Mechanics For Modeling Transient Dynamics (Poster) []  In recent years, machine learning (ML) techniques developed for Natural Language Processing (NLP) have permeated into developing better computer vision algorithms. In this work, we use such NLP-inspired techniques to improve the accuracy, robustness and generalizability of ML models for simulating transient dynamics. We introduce teacher forcing and curriculum learning based training mechanics to model vortical flows and show an enhancement in accuracy of more than 50% for ML models such as FNO and UNet. Lalit Ghule · Rishikesh Ranade · Jay Pathak 🔗 - Neural Network-based Real-Time Parameter Estimation in Electrochemical Sensors with Unknown Confounding Factors (Poster) []  Real-time parameter estimation from measurements in electrochemical sensors remains a challenge. Traditional methods used to characterize the response and estimate parameters of interest from electrochemical sensors are often slow and time-consuming, and thus not suitable for real-time applications. Here, we develop a workflow utilizing physics-based processing and deep learning to estimate parameters and confounding variables with uncertainties in real-time from large amplitude AC Voltammetry (LA-ACV) measurements on electrochemical sensors. The physics-based processing enables the extraction of physical information about the system from the measurement data, and deep learning enables rapid inverse-problem solutions. We experimentally demonstrate our approach in an electrochemical system (K3Fe(CN)6 in potassium phosphate buffer) to estimate the concentration of redox-active species (K3Fe(CN)6) in the presence of unknown viscosity of the medium (confounding variable), with 0.45 (± 0.07) mM median absolute error in concentration estimation. 
The proposed workflow leveraging physics-based processing and deep learning can be applied reproducibly to any electrochemical system for real-time parameter estimation. Sarthak Jariwala 🔗 - Learning dynamical systems: an example from open quantum system dynamics. (Poster) Machine learning algorithms designed to learn dynamical systems from data can be used to forecast, control and interpret the observed dynamics. In this abstract we exemplify the use of one such algorithm, namely Koopman operator learning, in the context of open quantum system dynamics. We will study the dynamics of a spin chain coupled with dephasing gates and show how Koopman operator learning is an approach to efficiently learn not only the evolution of the density matrix, but also of every physical observable associated with the system. Finally, using the spectral decomposition of the learned Koopman operator, we show how symmetries obeyed by the underlying dynamics can be inferred directly from data. Pietro Novelli 🔗 - Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators (Poster) []  Recent advancements in self-supervised learning and transfer learning methods have popularized approaches that involve pretraining models from massive data sources and subsequent finetuning of such models towards a specific task. While such approaches have become the norm in fields such as natural language processing, implementation and evaluation of transfer learning approaches for chemistry are in the early stages. In this work, we demonstrate finetuning for downstream tasks on a graph neural network (GNN) trained over a molecular database containing 2.7 million water clusters. The use of Graphcore IPUs as an AI accelerator for training molecular GNNs reduces training time from a reported 2.7 days on 0.5M clusters to 92 minutes on 2.7M clusters. 
Finetuning the pretrained model for downstream tasks of molecular dynamics and level-of-theory transfer took only 8.3 hours and 28 minutes, respectively, on a single GPU. Jenna A Bilbrey · Kristina Herman · Henry Sprueill · Sotiris Xantheas · Payel Das · Manuel Lopez Roldan · Mike Kraus · Hatem Helal · Sutanay Choudhury 🔗 - Emulating Fast Processes in Climate Models (Poster) []  Cloud microphysical parameterizations in atmospheric models describe the formation and evolution of clouds and precipitation, a central weather and climate process. Cloud-associated latent heating is a primary driver of large and small-scale circulations throughout the global atmosphere, and clouds have important interactions with atmospheric radiation. Clouds are ubiquitous, diverse, and can change rapidly. In this work, we build the first emulator of an entire cloud microphysical parameterization, including fast phase changes. The emulator performs well in offline and online (i.e. when coupled to the rest of the atmospheric model) tests, but shows some developing biases in Antarctica. Sensitivity tests demonstrate that these successes require careful modeling of the mixed discrete-continuous output as well as the input-output structure of the underlying code and physical process. Noah Brenowitz · W. Andre Perkins · Jacqueline M. Nugent · Oliver Watt-Meyer · Spencer K. Clark · Anna Kwa · Brian Henn · Jeremy McGibbon · Christopher S. Bretherton 🔗 - GAUCHE: A Library for Gaussian Processes in Chemistry (Poster) []  We introduce GAUCHE, a library for GAUssian processes in CHEmistry. Gaussian processes have long been a cornerstone of probabilistic machine learning, affording particular advantages for uncertainty quantification and Bayesian optimisation. Extending Gaussian processes to chemical representations however is nontrivial, necessitating kernels defined over structured inputs such as graphs, strings and bit vectors. 
By defining such kernels in GAUCHE, we seek to open the door to powerful tools for uncertainty quantification and Bayesian optimisation in chemistry. Motivated by scenarios frequently encountered in experimental chemistry, we showcase applications for GAUCHE in molecular discovery and chemical reaction optimisation. Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Bojana Rankovic · Yuanqi Du · Arian Jamasb · Julius Schwartz · Austin Tripp · Gregory Kell · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Alpha Lee · Philippe Schwaller · Jian Tang 🔗 - One-shot learning for solution operators of partial differential equations (Poster) []  Discovering governing equations of a physical system, represented by partial differential equations (PDEs), from data is a central challenge in a variety of areas of science and engineering. Current methods require either some prior knowledge (e.g., candidate PDE terms) to discover the PDE form, or a large dataset to learn a surrogate model of the PDE solution operator. Here, we propose the first solution operator learning method that only needs one PDE solution, i.e., one-shot learning. We first decompose the entire computational domain into small domains, where we learn a local solution operator, and then we find the coupled solution via either mesh-based fixed-point iteration or meshfree local-solution-operator informed neural networks. We demonstrate the effectiveness of our method on different PDEs, and our method exhibits a strong generalization property. Lu Lu · Anran Jiao · Jay Pathak · Rishikesh Ranade · Haiyang He 🔗 - Wavelets Beat Monkeys at Adversarial Robustness (Poster) []  Research on improving the robustness of neural networks to adversarial noise - imperceptible malicious perturbations of the data - has received significant attention. Neural nets struggle to recognize corrupted images that are easily recognized by humans. 
The currently uncontested state-of-the-art defence to obtain robust deep neural networks is adversarial training (AT), but it consumes significantly more resources compared to standard training and trades off accuracy for robustness. An inspiring recent work (Dapello et al., 2020) aims to bring neurobiological tools to the question: How can we develop Neural Nets that robustly generalize like human vision? They design a network structure with a neural hidden first layer that mimics the primate primary visual cortex (V1), followed by a back-end structure adapted from current CNN vision models. This front-end layer, called VOneBlock, consists of a biologically inspired Gabor Filter Bank with fixed handcrafted "biologically constrained" weights, simple and complex cell non-linearities and a "V1 stochasticity generator" injecting randomness. It seems to achieve non-trivial adversarial robustness on standard vision benchmarks when tested on small perturbations. Here we revisit this biologically inspired work, which heavily relies on handcrafted tuning of the parameters of the V1 unit based on neural responses derived from experimental records of macaque monkeys. We ask whether a principled parameter-free representation with inspiration from physics is able to achieve the same goal. We discover that the wavelet scattering transform can replace the complex V1-cortex and simple uniform Gaussian noise can take the role of neural stochasticity, to achieve adversarial robustness. In extensive experiments on the CIFAR-10 benchmark with adaptive adversarial attacks we show that: 1) Robustness of VOneBlock architectures is relatively weak (though non-zero) when the strength of the adversarial attack radius is set to commonly used benchmarks. 2) Replacing the front-end VOneBlock by an off-the-shelf parameter-free Scatternet followed by simple uniform Gaussian noise can achieve much more substantial adversarial robustness without adversarial training. 
Our work shows how physically inspired structures yield new insights into robustness that were previously only thought possible by meticulously mimicking the human cortex. Physics, rather than only neuroscience, can guide us towards more robust neural networks. Jingtong Su · Julia Kempe 🔗 - Qubit seriation: Undoing data shuffling using spectral ordering (Poster) []  With the advent of quantum and quantum-inspired machine learning, adapting the structure of learning models to match the structure of target datasets has been shown to be crucial for obtaining high performance. Probabilistic models based on tensor networks (TNs) are prime candidates to benefit from data-dependent design considerations, owing to their bias towards correlations that are local with respect to the topology of the model. In this work, we use methods from spectral graph theory to search for optimal permutations of model sites that are adapted to the structure of an input dataset. Our method uses pair-wise mutual information estimates from the target dataset to ensure that strongly correlated bits are placed closer to each other relative to the model’s topology. We demonstrate the effectiveness of such preprocessing for probabilistic modeling tasks, finding substantial improvements in the performance of generative models based on matrix product states (MPS) across a variety of datasets. We also show how spectral embedding, a dimensionality reduction technique from spectral graph theory, can be used to gain further insights into the structure of datasets of interest. Atithi Acharya · Manuel Rudolph · Jing Chen · Jacob Miller · Alejandro Perdemo-Ortiz 🔗 - First principles physics-informed neural network for quantum wavefunctions and eigenvalue surfaces (Poster) []  Physics-informed neural networks have been widely applied to learn general parametric solutions of differential equations. Here, we propose a neural network to discover parametric eigenvalue and eigenfunction surfaces of quantum systems. 
We apply our method to solve the hydrogen molecular ion. This is an ab initio deep learning method that solves the Schrödinger equation with the Coulomb potential yielding realistic wavefunctions that include a cusp at the ion positions. The neural solutions are continuous and differentiable functions of the interatomic distance and their derivatives are analytically calculated by applying automatic differentiation. Such a parametric and analytical form of the solutions is useful for further calculations such as the determination of force fields. Marios Mattheakis · Gabriel R. Schleder · Daniel Larson · Efthimios Kaxiras 🔗 - Clustering Behaviour of Physics-Informed Neural Networks: Inverse Modeling of An Idealized Ice Shelf (Poster) []  We investigate the use of Physics-Informed Neural Networks (PINNs) for ice shelf hardness inversion, focusing on the effect of the relative weighting between equation and data components in the PINN objective function on its predictive performance. In the objective function we use a hyperparameter gamma which adjusts the relative priority given to the fit of the PINN to known physical laws and its fit to the training data. We train the PINN with a range of gamma, and training data with varying magnitudes of injected noise. We find that the PINN solutions converge to two different clusters in the prediction error space; one cluster corresponds to accurate, "low-error" solutions, while the other consists of "high-error" solutions that were likely trapped in a local minimum of the PINN objective function and fit poorly to the ground truth datasets. We call this the PINN clustering behaviour, which persists for a wide range of gamma, noise level, and even with clean data. Using k-means clustering, we filter out the PINN solutions in the high-error clusters. The accuracy of the solutions in the low-error cluster varies with gamma and the data noise. 
We find that the value of $\gamma$ that minimizes the error of PINN-predicted ice hardness varies significantly with the data noise. With the optimal choice of gamma, the PINN can remove the noise in the data and successfully predict the noise-free velocity, thickness and the ice hardness. The clustering phenomenon is observed for a wide range of parameter settings and is of practical, as well as theoretical interest. Yunona Iwasaki · Ching-Yao Lai 🔗 - Renormalization in the neural network-quantum field theory correspondence (Poster) []  A statistical ensemble of neural networks can be described in terms of a quantum field theory (NN-QFT correspondence). The infinite-width limit is mapped to a free field theory, while finite N corrections are mapped to interactions. After reviewing the correspondence, we will describe how to implement renormalization in this context and discuss preliminary numerical results for translation-invariant kernels. A major outcome is that changing the standard deviation of the neural network weight distribution corresponds to a renormalization flow in the space of networks. Harold Erbin · Vincent Lahoche · Dine Ousmane Samary 🔗 - Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction (Poster) []  A central problem in computational biophysics is protein structure prediction, i.e., finding the optimal folding of a given amino acid sequence. This problem has been studied in a classical abstract model, the HP model, where the protein is modeled as a sequence of H (hydrophobic) and P (polar) amino acids on a lattice. The objective is to find conformations maximizing H-H contacts. It is known that even in this reduced setting, the problem is intractable (NP-hard). In this work, we apply deep reinforcement learning (DRL) to the two-dimensional HP model. We can obtain the best known conformations for benchmark HP sequences with lengths from 20 to 50. Our DRL is based on a deep Q-network (DQN). 
We find that a DQN based on long short-term memory (LSTM) architecture greatly enhances the RL learning ability and significantly improves the search process. DRL can sample the state space efficiently, without the need for manual heuristics. Experimentally we show that it can find multiple distinct best-known solutions per trial. This study demonstrates the effectiveness of deep reinforcement learning in the HP model for protein folding. Kaiyuan Yang · Houjing Huang · Olafs Vandans · Adithyavairavan Murali · Fujia Tian · Roland Yap · Liang Dai 🔗 - Intra-Event Aware Imitation Game for Fast Detector Simulation (Poster) []  While realistic detector simulations are an essential component of particle physics experiments, current methods are computationally inefficient, requiring significant resources to produce, store, and distribute simulation data. In this work, we propose the Intra-Event Aware GAN (IEA-GAN), a deep generative model which allows for faster and more resource-efficient simulations. We demonstrate its use in generating sensor-dependent images for the Pixel Vertex Detector (PXD) at the Belle II Experiment, the sub-detector with the highest spatial resolution. We show that using the domain-specific relational inductive bias introduced by our Relational Reasoning Module, one can approximate the concept of a collision event in the detector simulation. We also propose a Uniformity loss to maximize the information entropy of the IEA-GAN discriminator's knowledge and an Intra-Event Aware loss for the generator to imitate the discriminator's dyadic class-to-class knowledge. We show that the IEA-GAN not only captures the fine-grained semantic and statistical similarity between the images but also finds correlations among them, leading to a significant improvement in image fidelity and diversity compared to the previous state-of-the-art models. 
Hosein Hashemi · Nikolai Hartmann · Sahand Sharifzadeh · James Kahn · Thomas Kuhr 🔗 - Deep Learning Modeling of Subgrid Physics in Cosmological N-body Simulations (Poster) []  Running N-body simulations is an extremely time- and resource-consuming process for researchers. Many schemes have tried to approximate the correct positions of celestial objects in such simulations. In this paper, we propose using Neural Networks in the physical and in the Fourier domain in order to correct a Particle Mesh scheme, primarily at the smaller scales, i.e., the finer details of the N-body simulations. In addition, we train our models using a technique recently proposed in the literature, i.e., through an Ordinary Differential Equations (ODE) solver. We present promising results for the different types of Neural Networks that we experimented with. Georgios Markos Chatziloizos · Francois Lanusse · Tristan Cazenave 🔗 - Combinational-convolution for flow-based sampling algorithm (Poster) []  We propose a new class of efficient layer called CombiConv (Combinational-convolution) that improves the acceptance rate for the flow-based sampling algorithm for quantum field theory on the lattice. CombiConv builds a $d$-dimensional convolution out of $\binom{d}{k}$ lower $k$-dimensional convolutions by combining their outputs, and it has fewer parameters than the standard convolutions. We apply CombiConv to the flow-based sampling algorithm. Furthermore, we find that for scalar $\phi^4$ theory in $d=2,3,4$ dimensions, CombiConv with $k=1$ achieves a higher acceptance rate than others. Akio Tomiya 🔗 - Point Cloud Generation using Transformer Encoders and Normalising Flows (Poster) []  Data generation based on Machine Learning has become a major research topic in particle physics. This is due to the current Monte Carlo simulation approach being computationally challenging for future colliders, which will have a significantly higher luminosity. 
The generation of collider data is similar to point cloud generation, but arguably more difficult as there are complex correlations between the points which need to be modelled correctly. A refinement model consisting of normalising flows and transformer encoders is presented. The normalising flow output is corrected by a transformer encoder, which is adversarially trained against another transformer encoder. The model reaches state-of-the-art results with a lightweight model architecture which is stable to train. Benno Käch · Dirk Krücker · Isabell Melzer 🔗 - Learning Similarity Metrics for Volumetric Simulations with Multiscale CNNs (Poster) []  We propose a similarity model based on entropy, which allows for the creation of physically meaningful ground truth distances for the similarity assessment of scalar and vectorial data, produced from transport and motion-based simulations. Utilizing two data acquisition methods derived from this model, we create collections of fields from numerical PDE solvers and existing simulation data repositories. Furthermore, a multiscale CNN architecture that computes a volumetric similarity metric (VolSiM) is proposed and its robustness is evaluated on a large range of test data. To the best of our knowledge this is the first learning method inherently designed to address the similarity assessment of high-dimensional simulation data. Georg Kohl · Liwei Chen · Nils Thuerey 🔗 - Stabilization and Acceleration of CFD Simulation by Controlling Relaxation Factor Based on Residues: An SNN Based Approach (Poster) []  Computational Fluid Dynamics (CFD) simulation involves the solution of a sparse system of linear equations. Faster convergence to a physically meaningful CFD simulation result of steady-state physics depends largely on the choice of optimum value of the under-relaxation factor (URF) and continuous manual monitoring of simulation residues. 
In this paper, we present an algorithm for classifying simulation convergence (or divergence) based on the residues using a spiking neural network (SNN) and a control logic. This algorithm maintains optimum URF throughout the simulation process and ensures accelerated convergence of the simulation. The algorithm is also able to stabilize and bring back a diverging simulation to the converging range automatically without manual intervention. To the best of our knowledge, SNN is used for the first time to solve such a complex classification problem and it achieves an accuracy of 92.4% in detecting the divergent cases. When tested on two steady-state incompressible CFD problems, our solution is able to stabilize every diverging simulation and accelerate the simulation by at least 10% compared to a constant value of URF. Sounak Dey · Dighanchal Banerjee · Mithilesh Maurya · Dilshad Ahmad 🔗 - Simulation-based inference of the 2D ex-situ stellar mass fraction distribution of galaxies using variational autoencoders (Poster) []  Galaxies grow through star formation (in-situ) and accretion (ex-situ) of other galaxies. Reconstructing the relative contribution of these two growth channels is crucial for constraining the processes of galaxy formation in a cosmological context. In this ongoing work, we utilize a conditional variational autoencoder along with a normalizing flow - trained on a state-of-the-art cosmological simulation - in an attempt to infer the posterior distribution of the 2D ex-situ stellar mass distribution of galaxies solely from observable two-dimensional maps of their stellar mass, kinematics, age and metallicity. Such maps are typically obtained from large Integral Field Unit Surveys such as MaNGA. We find that the average posterior provides an estimate of the resolved accretion histories of galaxies with a mean ∼ 10% error per pixel. 
We show that the use of a normalizing flow to conditionally sample the latent space results in a smaller reconstruction error. Due to the probabilistic nature of our architecture, the uncertainty of our predictions can also be quantified. To our knowledge, this is the first attempt to infer the 2D ex-situ fraction maps from observable maps. Eirini Angeloudi · Marc Huertas-Company · Jesús Falcón-Barroso · Regina Sarmiento · Daniel Walo-Martín · Annalisa Pillepich · Jesús Vega Ferrero 🔗 - Uncertainty quantification methods for ML-based surrogate models of scientific applications (Poster) []  In recent years, there has been growing interest in using machine-learning algorithms to assist classical numerical methods for scientific computations, as this data-driven approach could reduce computational cost. While faster execution is attractive, accuracy should be preserved. Perhaps more importantly, our ability to identify when a given machine-learning surrogate is not reliable should make their application more robust. We aim to quantify the uncertainty of predictions through the application of Bayesian and ensemble methods. We apply these methods to approximate a paraboloid and then the solution to the wave equation with both standard neural networks and physics-informed neural networks. We demonstrate that the embedding of physics information in neural networks reduces the model uncertainty while improving the accuracy. Between the two uncertainty quantification methods, our results show that the Bayesian neural networks render overconfident results while model outputs from a well-constructed ensemble are appropriately conservative. Kishore Basu · Yujia Hao · Delphine Hintz · Dev Shah · Aaron Palmer · Gurpreet Singh Hora · Darian Nwankwo · Laurent White 🔗 - Contrasting random and learned features in deep Bayesian linear regression (Poster) []  Understanding how feature learning affects generalization is among the foremost goals of modern deep learning theory. 
Here, we use the replica method from the statistical mechanics of disordered systems to study how the ability to learn representations affects the generalization performance of a simple class of models: deep Bayesian linear neural networks trained on unstructured Gaussian data. By comparing deep random feature models to deep networks in which all layers are trained, we provide a detailed characterization of the interplay between width, depth, data density, and prior mismatch. Random feature models can have particular widths that are optimal for generalization at a given data density, while making neural networks as wide or as narrow as possible is always optimal. Moreover, we show that the leading-order correction to the kernel-limit learning curve cannot distinguish between random feature models and deep networks in which all layers are trained. Taken together, our findings begin to elucidate how architectural details affect generalization performance in this simple class of deep regression models. Jacob Zavatone-Veth · William Tong · Cengiz Pehlevan 🔗 - DS-GPS : A Deep Statistical Graph Poisson Solver (for faster CFD simulations) (Poster) []  This paper proposes a novel Machine Learning-based approach to solve a Poisson problem with mixed boundary conditions. Leveraging Graph Neural Networks, we develop a model able to process unstructured grids with the advantage of enforcing boundary conditions by design. By directly minimizing the residual of the Poisson equation, the model attempts to learn the physics of the problem without the need for exact solutions, in contrast to most previous data-driven processes where the distance with the available solutions is minimized. Matthieu Nastorg 🔗 - Dynamical Mean Field Theory of Kernel Evolution in Wide Neural Networks (Poster) []  We analyze feature learning in infinite-width neural networks trained with gradient flow through a self-consistent dynamical field theory. 
We construct a collection of deterministic dynamical order parameters which are inner-product kernels for hidden unit activations and gradients in each layer at pairs of time points, providing a reduced description of network activity through training. These kernel order parameters collectively define the hidden layer activation distribution, the evolution of the neural tangent kernel, and consequently output predictions. We provide a sampling procedure to self-consistently solve for the kernel order parameters. Blake Bordelon · Cengiz Pehlevan 🔗 - Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection (Poster) []  In the era of big astronomical surveys, our ability to leverage artificial intelligence algorithms simultaneously for multiple datasets will open new avenues for scientific discovery. Unfortunately, simply training a deep neural network on images from one data domain often leads to very poor performance on any other dataset. Here we develop a Universal Domain Adaptation method DeepAstroUDA, capable of performing semi-supervised domain alignment that can be applied to datasets with different types of class overlap. Extra classes can be present in any of the two datasets, and the method can even be used in the presence of unknown classes. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets (from SDSS and DECaLS). We show that our method is capable of bridging the gap between two astronomical surveys, and also performs well for anomaly detection and clustering of unknown data in the unlabeled dataset. 
We apply our model to two examples of galaxy morphology classification tasks with anomaly detection: 1) classifying spiral and elliptical galaxies with detection of merging galaxies (three classes including one unknown anomaly class); 2) a more granular problem where the classes describe more detailed morphological properties of galaxies, with the detection of gravitational lenses (ten classes including one unknown anomaly class). Aleksandra Ciprijanovic · Ashia Lewis · Kevin Pedro · Sandeep Madireddy · Brian Nord · Gabriel Nathan Perdue · Stefan Wild 🔗 - A Neural Network Subgrid Model of the Early Stages of Planet Formation (Poster) []  Planet formation is a multi-scale process in which the coagulation of μm-sized dust grains in protoplanetary disks is strongly influenced by the hydrodynamic processes on scales of astronomical units (≈ 1.5 × 10^8 km). Studies are therefore dependent on subgrid models to emulate the micro physics of dust coagulation on top of a large scale hydrodynamic simulation. Numerical simulations which include the relevant physical effects are complex and computationally expensive. Here, we present a fast and accurate learned effective model for dust coagulation, trained on data from high resolution numerical coagulation simulations. Our model captures details of the dust coagulation process that were so far not tractable with other dust coagulation prescriptions with similar computational efficiency. Thomas Pfeil · Miles Cranmer · Shirley Ho · Philip Armitage · Tilman Birnstiel · Hubert Klahr 🔗 - Validation Diagnostics for SBI algorithms based on Normalizing Flows (Poster) []  Building on the recent trend of new deep generative models known as Normalizing Flows (NF), simulation-based inference (SBI) algorithms can now efficiently accommodate arbitrary complex and high-dimensional data distributions. The development of appropriate validation methods however has fallen behind. 
Indeed, most of the existing metrics either require access to the true posterior distribution, or fail to provide theoretical guarantees on the consistency of the inferred approximation beyond the one-dimensional setting. This work proposes easy-to-interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF. It also offers theoretical guarantees based on results of local consistency. The proposed workflow can be used to check, analyse and guarantee consistent behavior of the estimator. The method is illustrated with a challenging example that involves tightly coupled parameters in the context of computational neuroscience. This work should help the design of better specified models or drive the development of novel SBI algorithms, thereby building trust in their ability to address important questions in experimental science. Julia Linhart · Alexandre Gramfort · Pedro Rodrigues 🔗 - One Network to Approximate Them All: Amortized Variational Inference of Ising Ground States (Poster) []  For a wide range of combinatorial optimization problems, finding the optimal solutions is equivalent to finding the ground states of corresponding Ising Hamiltonians. Recent work shows that these ground states are found more efficiently by variational approaches using autoregressive models than by traditional methods. In contrast to previous works, where for every problem instance a new model has to be trained, we aim at a single model that approximates the ground states for a whole family of Hamiltonians. We demonstrate that autoregressive neural networks can be trained to achieve this goal and are able to generalize across a class of problems. We iteratively approximate the ground state based on a representation of the Hamiltonian that is provided by a graph neural network. 
Our experiments show that solving a large number of related problem instances by a single model can be considerably more efficient than solving them individually. Sebastian Sanokowski · Wilhelm Berghammer · Johannes Kofler · Sepp Hochreiter · Sebastian Lehner 🔗 - Hybrid integration of the gravitational N-body problem with Artificial Neural Networks (Poster) []  Studying the evolution of the gravitational N-body problem becomes extremely computationally expensive as the number of bodies increases. In order to alleviate this problem, we study the use of Artificial Neural Networks (ANNs) to substitute expensive parts of the integration of planetary systems. We compare the performance of a Hamiltonian Neural Network (HNN) which includes physics constraints into its architecture with a conventional Deep Neural Network (DNN). We find that HNNs are able to conserve energy better than DNNs in a simplified scenario with two planets, but become challenging to train for a more realistic case, namely when adding asteroids. We develop a hybrid integrator that chooses between the network's prediction and the numerical computation, and show that for a number of asteroids >60, using ANNs improves the computational cost of the simulation while allowing for an accurate reproduction of the trajectory of the bodies. Veronica Saz Ulibarrena · Simon Portegies Zwart · Elena Sellentin · Barry Koren · Philipp Horn · Maxwell X. Cai 🔗 - CAPE: Channel-Attention-Based PDE Parameter Embeddings for SciML (Poster) []  Scientific Machine Learning (SciML) is concerned with the development of machine learning methods for emulating physical systems governed by partial differential equations (PDE). ML-based surrogate models substitute inefficient and often non-differentiable numerical simulation algorithms and find multiple applications such as weather forecasting and molecular dynamics. 
While a number of ML-based methods for approximating the solutions of PDEs have been proposed in recent years, they typically do not consider the parameters of the PDEs, making it difficult for the ML surrogate models to generalize to PDE parameters not seen during training. We propose a new channel-attention-based parameter embedding (CAPE) component for scientific machine learning models. The CAPE module can be combined with any neural PDE solver allowing it to adapt to unseen PDE parameters without harming the original model’s performance. We compare CAPE using a PDE benchmark and obtain significant improvements over the base models. An implementation of the method and experiments are available at https://anonymous.4open.science/r/CAPE-ML4Sci-145 Makoto Takamoto · Francesco Alesiani · Mathias Niepert 🔗 - Real-time Health Monitoring of Heat Exchangers using Hypernetworks and PINNs (Poster) []  We demonstrate a Physics-informed Neural Network (PINN) based model for real-time health monitoring of a heat exchanger, which plays a critical role in improving the energy efficiency of thermal power plants. A hypernetwork-based approach is used to enable the domain-decomposed PINN to learn the thermal behavior of the heat exchanger in response to dynamic boundary conditions, eliminating the need to re-train. As a result, we achieve orders of magnitude reduction in inference time in comparison to existing PINNs, while maintaining accuracy on par with the physics-based simulations. This makes the approach very attractive for predictive maintenance of the heat exchanger in digital twin environments. 
Ritam Majumdar · Vishal Jadhav · Anirudh Deodhar · Shirish Karande · Lovekesh Vig · Venkataramana Runkana 🔗 - Physics-Informed CNNs for Super-Resolution of Sparse Observations on Dynamical Systems (Poster) []  In the absence of high-resolution samples, super-resolution of sparse observations on dynamical systems is a challenging problem with wide-reaching applications in experimental settings. We showcase the application of physics-informed convolutional neural networks for super-resolution of sparse observations on grids. Results are shown for the chaotic-turbulent Kolmogorov flow, demonstrating the potential of this method for resolving finer scales of turbulence when compared with classic interpolation methods, and thus reconstructing missing physics. Daniel Kelshaw · Georgios Rigas · Luca Magri 🔗 - Neural Inference of Gaussian Processes for Time Series Data of Quasars (Poster) []  The study of single-band quasar light curves poses two problems: inference of the power spectrum and interpolation of an irregularly sampled time series. A baseline approach to these tasks is to interpolate a time series with a Damped Random Walk (DRW) model, in which the spectrum is inferred using Maximum Likelihood Estimation (MLE). However, the DRW model does not describe the smoothness of the time series, whereas MLE faces many problems from the theory of optimization and computational math. In this work, we introduce a new stochastic model, that we call Convolved Damped Random Walk (CDRW). This model introduces a concept of smoothness to a Damped Random Walk, which enables it to fully describe quasar spectra. Moreover, we introduce a new method of inference of Gaussian process parameters, which we call Neural inference. This method uses the powers of the state-of-the-art neural networks to improve the conventional MLE inference technique. In our experiments, the Neural inference method results in significant improvement over the baseline MLE (RMSE: 0.318 → 0.205, 0.464 → 0.444). 
Moreover, the combination of both the CDRW model and the Neural inference significantly outperforms the baseline DRW and MLE in the interpolation of a typical quasar light curve (χ2: 0.333 → 0.998, 2.695 → 0.981). Egor Danilov · Aleksandra Ciprijanovic · Brian Nord 🔗 - Deep Learning-Based Spatiotemporal Multi-Event Reconstruction for Delay-Line Detectors (Poster) []  Accurate observation of two or more particles within a very narrow time window has always been a great challenge in modern physics. It opens the possibility for correlation experiments, as e.g. the important Hanbury Brown-Twiss experiment, leading to new physical insights. For low-energy electrons, one possibility is to use a micro-channel plate with subsequent delay-lines for the readout of the incident particle hits. With such a Delay-Line Detector the spatial and temporal coordinates of more than one particle can be fully reconstructed as soon as both particles have a larger separation than what is called the dead radius. For events where two electrons are closer in space and time, the determination of the individual positions of the particles requires elaborated peak finding algorithms. While classical methods work well with single particle hits, they fail to identify and reconstruct events caused by multiple particles when they arrive close in space and time. To address this challenge, a new spatiotemporal machine learning model is developed to identify and reconstruct the position and time of such multi-hit signals. The model achieves a much better resolution for near-by particle hits compared to the classical approaches, reducing the dead radius by half. This shows that machine learning models can be effective in improving the spatiotemporal performance of Delay-Line Detectors. 
Marco Knipfer · Sergei Gleyzer · Stefan Meier · Jonas Heimerl · Peter Hommelhoff 🔗 - Tensor networks for active inference with discrete observation spaces (Poster) []  In recent years, quantum physics-inspired tensor networks have seen an explosion in use cases. While these networks were originally developed to model many-body quantum systems, their usage has expanded into the field of machine learning, where they are often used as an alternative to neural networks. In a similar way, the neuroscience-based theory of active inference, a general framework for behavior and learning in autonomous agents, has started branching out into machine learning. Since every aspect of an active inference model, such as the latent space structure, must be manually defined, efforts have been made to learn state space representations automatically from observations using deep neural networks. In this work, we show that tensor networks can be employed to learn an active inference model with a discrete observation space. We demonstrate our method on the T-maze problem and show that the agent acts Bayes optimal as expected under active inference. Samuel T. Wauthier · Bram Vanhecke · Tim Verbelen · Bart Dhoedt 🔗 - Employing CycleGANs to Generate Realistic STEM Images for Machine Learning (Poster) []  Identifying atomic features in aberration-corrected scanning transmission electron microscopy (STEM) data is critical to understanding structures and properties of materials. Machine learning (ML) models have been applied to accelerate these tasks. The training sets for these ML models are typically constructed with codes that provide simulations of STEM images alongside desired labels. However, these simulated images are often limited by the oversimplified model and deviate from the experimental images, limiting the accuracy and precision of ML training. 
We present an approach to generating realistic STEM images by employing a cycleGAN to automatically add realistic microscopy features and noise profiles to simulated data. We also train a defect-identification neural network using these generated images and evaluate the model on real STEM images to locate atomic defects within them. The application of CycleGAN provides other machine learning models with more realistic training data for any type of supervised learning. Abid Khan · Chia-Hao Lee · Pinshane Y. Huang · Bryan Clark 🔗 - HubbardNet: Efficient Predictions of the Bose-Hubbard Model Spectrum with Deep Neural Networks (Poster) []  We present a deep neural network (DNN)-based model, the HubbardNet, to variationally solve for the ground state and excited state wavefunctions of the one-dimensional and two-dimensional Bose-Hubbard model on a square lattice. Using this model, we obtain the Bose-Hubbard energy spectrum as an analytic function of the Coulomb parameter, U, and the total number of particles, N, from a single training, bypassing the need to solve a new hamiltonian for each different input. We show that the DNN-parametrized solutions have excellent agreement with exact diagonalization while outperforming exact diagonalization in terms of computational scaling, suggesting that our model is promising for efficient, accurate computation of exact phase diagrams of many-body lattice hamiltonians. Ziyan Zhu · Marios Mattheakis · Weiwei Pan · Efthimios Kaxiras 🔗 - Strong-Lensing Source Reconstruction with Denoising Diffusion Restoration Models (Poster) []  Analysis of galaxy–galaxy strong lensing systems is strongly dependent on any prior assumptions made about the appearance of the source. Here we present a method of imposing a data-driven prior/regularisation for source galaxies based on denoising diffusion probabilistic models (DDPMs). 
We use a pre-trained model for galaxy images, AstroDDPM, and a chain of conditional reconstruction steps called denoising diffusion restoration model (DDRM) to obtain samples consistent both with the noisy observation and with the distribution of training data for AstroDDPM. We show that these samples have the qualitative properties associated with the posterior for the source model: in a low-to-medium noise scenario they closely resemble the observation, while reconstructions from uncertain data show greater variability, consistent with the distribution encoded in the generative model used as prior. Konstantin Karchev · Noemi Anau Montel · Adam Coogan · Christoph Weniger 🔗 - Score-based Seismic Inverse Problems (Poster) []  We present a new family of score-based models designed specifically for seismic migration. We define a sequence of corruptions obtained by migration artifacts created by reverse time migration (RTM) as the number of measurements changes. Our network is conditioned on the number of source locations and refines the reconstructed image over an annealed sequence of steps. Experiments on synthetic seismic data show that we can reconstruct geological details using a very small number of sources. Our method produces significantly higher-quality images compared to posterior sampling using standard score-based generative models and supervised seismic migration baselines. Sriram Ravula · Dimitri Voytan · Elad Liebman · Ram Tuvi · Yash Gandhi · Hamza Ghani · Alex Ardel · Mrinal Sen · Alex Dimakis 🔗 - Deep-pretrained-FWI: combining supervised learning with physics-informed neural network (Poster) []  An accurate velocity model is essential to make a good seismic image. Conventional methods to perform Velocity Model Building (VMB) tasks rely on inverse methods, which, despite being widely used, are ill-posed problems that require intense and specialized human supervision. 
Convolutional Neural Networks (CNN) have been extensively investigated as an alternative to solve the VMB task. Two main approaches were investigated in the literature: supervised training and Physics-Informed Neural Networks (PINN). Supervised training presents some generalization issues since structures and velocity ranges must be similar in the training and test sets. Some works integrated Full-waveform Inversion (FWI) with CNN, defining the problem of VMB in the PINN framework. In this case, the CNN stabilizes the inversion, acting like a regularizer and avoiding local minima-related problems and, in some cases, sparing an initial velocity model. Our approach combines supervised and physics-informed neural networks by using transfer learning to start the inversion. The pre-trained CNN is obtained using a supervised approach based on training with a reduced and simple data set to capture the main velocity trend at the initial FWI iterations. We show that transfer learning reduces the uncertainties of the process, accelerates model convergence, and improves the final scores of the iterative process. ANA PAULA MULLER · Clecio Roque Bom · Jessé Carvalho Costa · Elisângela Lopes Faria · Marcelo Portes de Albuquerque · Marcio Portes de Albuquerque 🔗 - Differentiable composition for model discovery (Poster) []  We propose DiffComp, a symbolic regressor that can learn arbitrary function compositions, including derivatives of various orders. We use DiffComp in conjunction with a Physics Informed Neural Network (PINN) to discover differential equations from data. DiffComp has a layered structure where a set of user-defined basic functions are composed up to a specified depth. As it is differentiable, it can be trained using gradient descent. We test the architecture using simulated data from common PDEs and compare to existing model discovery frameworks, including PySINDy and DeePyMoD. We then test on oceanographic data. 
Omer Rochman Sharabi · Gilles Louppe 🔗 - Improving Generalization with Physical Equations (Poster) Hybrid modelling reduces the misspecification of expert physical models with a machine learning (ML) component learned from data. Similarly to many ML algorithms, hybrid model performance guarantees are limited to the training distribution. To address this limitation, here we introduce a hybrid data augmentation strategy, termed expert augmentation. Based on a probabilistic formalization of hybrid modelling, we demonstrate that expert augmentation improves generalization. We validate the practical benefits of expert augmentation on a set of simulated and real-world systems described by classical mechanics. Antoine Wehenkel · Jens Behrmann · Hsiang Hsu · Guillermo Sapiro · Gilles Louppe · Joern-Henrik Jacobsen 🔗 - Neural Fields for Fast and Scalable Interpolation of Geophysical Ocean Variables (Poster) []  Optimal Interpolation (OI) is a widely used, highly trusted algorithm for interpolation and reconstruction problems in geosciences. With the influx of more satellite missions, we have access to more and more observations and it is becoming more pertinent to take advantage of these observations in applications such as forecasting and reanalysis. With the increase of the volume of available data, scalability remains an issue for standard OI and it prevents many practitioners from effectively and efficiently taking advantage of these large sums of data to learn the model hyperparameters. In this work, we leverage recent advances in Neural Fields (NerFs) as an alternative to the OI framework where we show how they can be easily applied to standard reconstruction problems in physical oceanography. We illustrate the relevance of NerFs for gap-filling of sparse measurements of sea surface height (SSH) via satellite altimetry and demonstrate how NerFs are scalable with comparable results to the standard OI. 
We find that NerFs are a practical set of methods that can be readily applied to geoscience interpolation problems and we anticipate a wider adoption in the future. Juan Emmanuel Johnson · Redouane Lguensat · ronan fablet · Emmanuel Cosme · Julien Le Sommer 🔗 - Interpretable Encoding of Galaxy Spectra (Poster) []  We present a novel loss function to train autoencoder models for galaxy spectra. Our architecture reliably captures intrinsic spectral features regardless of redshift, providing highly realistic reconstructions for SDSS galaxy spectra using as few as two latent parameters. But the interpretation of encoded parameters remains difficult because the decoding process is non-linear and the latent space can be highly degenerate: different latent positions can map to virtually indistinguishable spectra. To resolve this encoding ambiguity, we introduce a new similarity loss, which explicitly links latent-space distances to data-space distances. Minimizing the similarity loss together with the common fidelity loss leads to non-degenerate, highly accurate spectrum models that generalize over variations in noise, masking, and redshift, while providing a latent space distribution with clear separations between common and anomalous data. Yan Liang · Peter Melchior · Sicong Lu 🔗 - Neural Network Prior Mean for Particle Accelerator Injector Tuning (Poster) []  Bayesian optimization has been shown to be a powerful tool for solving black box problems during online accelerator optimization. The major advantage of Bayesian based optimization techniques is the ability to include prior information about the problem to speed up optimization, even if that information is not perfectly correlated with experimental measurements. In parallel, neural network surrogate system models of accelerator facilities are increasingly being made available, but at present they are not widely used in online optimization. 
In this work, we demonstrate the use of an approximate neural network surrogate model as a prior mean for Gaussian processes used in Bayesian optimization in a realistic setting. We show that the initial performance of Bayesian optimization is improved by using neural network surrogate models, even when surrogate models make erroneous predictions. Finally, we quantify requirements on surrogate prediction accuracy to achieve optimization performance when solving problems in high dimensional input spaces. Connie Xu · Ryan Roussel · Auralee Edelen 🔗 - Applications of Differentiable Physics Simulations in Particle Accelerator Modeling (Poster) []  Current physics models used to interpret experimental measurements of particle beams require either simplifying assumptions to be made in order to ensure analytical tractability, or black box optimization methods to perform model based inference. This reduces the quantity and quality of information gained from experimental measurements, in a system where measurements have a limited availability. However differentiable physics modeling, combined with machine learning techniques, can overcome these analysis limitations, enabling accurate, detailed model creation of physical accelerators. Here we examine two applications of differentiable modeling, first to characterize beam responses to accelerator elements exhibiting hysteretic behavior, and second to characterize beam distributions in high dimensional phase spaces. Ryan Roussel · Auralee Edelen 🔗 - A robust estimator of mutual information for deep learning interpretability (Poster) []  We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning models. To accurately estimate MI from a finite number of samples, we present GMM-MI, an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. 
GMM-MI is computationally efficient, robust to hyperparameter choices and provides the uncertainty on the MI estimate due to the finite sample size. We demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. Davide Piras · Hiranya Peiris · Andrew Pontzen · Luisa Lucie-Smith · Brian Nord · Ningyuan (Lillian) Guo 🔗 - Finding NEEMo: Geometric Fitting using Neural Estimation of the Energy Mover’s Distance (Poster) []  A novel neural architecture was recently developed that enforces an exact upper bound on the Lipschitz constant of the model by constraining the norm of its weights in a minimal way, resulting in higher expressiveness compared to other techniques. We present a new and interesting direction for this architecture: estimation of the Wasserstein metric (Earth Mover’s Distance) in optimal transport by employing the Kantorovich-Rubinstein duality to enable its use in geometric fitting applications. Specifically, we focus on the field of high-energy particle physics, where it has been shown that a metric for the space of particle-collider events can be defined based on the Wasserstein metric, referred to as the Energy Mover’s Distance (EMD). This metrization has the potential to revolutionize data-driven collider phenomenology. The work presented here represents a major step towards realizing this goal by providing a differentiable way of directly calculating the EMD. We show how the flexibility that our approach enables can be used to develop novel clustering algorithms. 
Ouail Kitouni · Mike Williams · Niklas S Nolte 🔗 - DIGS: Deep Inference of Galaxy Spectra with Neural Posterior Estimation (Poster) []  With the advent of billion-galaxy surveys with complex data, the need of the hour is to efficiently model galaxy spectral energy distributions (SEDs) with robust uncertainty quantification. The combination of Simulation-Based Inference (SBI) and amortized Neural Posterior Estimation (NPE) has been successfully used to analyse simulated and real galaxy photometry both precisely and efficiently. Here, we demonstrate a proof-of-concept study of spectra that is a) an efficient analysis of galaxy SEDs and inference of galaxy parameters with physically interpretable uncertainties; and b) amortized calculations of posterior distributions of said galaxy parameters at the modest cost of a few galaxy fits with MCMC methods. We show that SBI is capable of inferring very accurate galaxy stellar masses and metallicities. Our methodology also a) produces uncertainty constraints that are comparable to or moderately weaker than traditional inverse-modeling with Bayesian MCMC methods (e.g., 0.17 and 0.26 dex in stellar mass and metallicity for a given galaxy, respectively), and b) conducts rapid SED inference (~10^5 galaxy spectra via SBI/SNPE at the cost of 1 MCMC-based fit); this efficiency is needed in the era of JWST and Roman Telescope. Gourav Khullar · Brian Nord · Aleksandra Ciprijanovic · Jason Poh · Fei Xu · Ashwin Samudre 🔗 - Strong Lensing Parameter Estimation on Ground-Based Imaging Data Using Simulation-Based Inference (Poster) []  Current ground-based cosmological surveys, such as the Dark Energy Survey (DES), are predicted to discover thousands of galaxy-scale strong lenses, while future surveys, such as the Vera Rubin Observatory Legacy Survey of Space and Time (LSST) will increase that number by 1-2 orders of magnitude. 
The large number of strong lenses discoverable in future surveys will make strong lensing a highly competitive and complementary cosmic probe.To leverage the increased statistical power of the lenses that will be discovered through upcoming surveys, automated lens analysis techniques are necessary. We present two Simulation-Based Inference (SBI) approaches for lens parameter estimation of galaxy-galaxy lenses. We demonstrate successful application of Neural Density Estimators (NPE) to automate the inference of a 12-parameter lens mass model for DES-like ground-based imaging data. We compare our NPE constraints to a Bayesian Neural Network (BNN) and find that it outperforms the BNN, producing posterior distributions that are for the most part both more accurate and more precise; in particular, several source-light model parameters are systematically biased in the BNN implementation. Jason Poh · Ashwin Samudre · Aleksandra Ciprijanovic · Brian Nord · Joshua Frieman · Gourav Khullar 🔗 - Closing the resolution gap in Lyman alpha simulations with deep learning (Poster) []  In recent years, super-resolution and related approaches powered by deep neural networks have emerged as a compelling option to accelerate computationally expensive cosmological simulations, which require modeling complex multi-physics systems in large spatial volumes. However, training such models in a physically consistent way is not always feasible or well-defined, as the data volume output by a super-resolution model may be too large, and the spatiotemporal dynamics of the simulation as well as the statistics of key observables like Lyman alpha flux are very sensitive to changes in resolution. In this work we address both challenges simultaneously, training neural networks to synthesize \Lya{} and other hydrodynamic fields with correct statistics on the relevant length scales but represented on the coarse grid of the input simulations. 
Effectively, our method is capable of 8x super-resolving a coarse simulation in-place without increasing memory footprint, using just a single pair of simulations for training. With chunked inference, we are able to apply the model to simulations of arbitrary size after training, and demonstrate this capability on a very large volume simulation spanning 600 Mpc/$h$. Cooper Jacobus · Peter Harrington · Zarija Lukić 🔗 - Physics-Informed Convolutional Neural Networks for Corruption Removal on Dynamical Systems (Poster) []  Measurements on dynamical systems, experimental or otherwise, are often subjected to inaccuracies capable of introducing corruption; removal of which is a problem of fundamental importance in the physical sciences. In this work we propose physics-informed convolutional neural networks for stationary corruption removal, providing the means to extract physical solutions from data, given access to partial ground-truth observations at collocation points. We showcase the methodology for 2D incompressible Navier-Stokes equations in the chaotic-turbulent flow regime, demonstrating robustness to modality and magnitude of corruption. Daniel Kelshaw · Luca Magri 🔗 - Do graph neural networks learn jet substructure? (Poster) []  At the CERN LHC, the task of jet tagging, whose goal is to infer the origin of a jet given a set of final-state particles, is dominated by machine learning methods. Graph neural networks have been used to address this task by treating jets as point clouds with underlying, learnable, edge connections between the particles inside. We explore the decision-making process for one such state-of-the-art network, ParticleNet, by looking for relevant edge connections identified using the layerwise-relevance propagation technique. As the model is trained, we observe changes in the distribution of relevant edges connecting different intermediate clusters of particles, known as subjets. 
The resulting distribution of subjet connections is different for signal jets originating from top quarks, whose subjets typically correspond to its three decay products, and background jets originating from lighter quarks and gluons. This behavior indicates that the model is using traditional jet substructure observables, such as the number of prongs—energetic particle clusters—within a jet, when identifying jets. Farouk Mokhtar · Raghav Kansal · Javier Duarte 🔗 - Thermophysical Change Detection on the Moon with the Lunar Reconnaissance Orbiter Diviner sensor (Poster) []  The Moon is an archive for the history of the Solar System, as it has recorded and preserved physical events that have occurred over billions of years. NASA's Lunar Reconnaissance Orbiter (LRO) has been studying the lunar surface for more than 13 years, and its datasets contain valuable information about the evolution of the Moon. However, the vast amount of data collected by LRO makes the extraction of scientific insights very challenging - in the past, the majority of analyses relied on human review. Here, we present NEPHTHYS, an automated solution for discovering thermophysical changes on the surface using one of LRO's largest datasets: the thermal data collected by its Diviner instrument. Specifically, NEPHTHYS is able to perform systematic, efficient, and large-scale change detection of present-day impact craters on the surface. With further work, it could enable more comprehensive studies of lunar surface impact flux rates and surface evolution rates, providing critical new information for future missions. Jose Delgado-Centeno · Silvia Bucci · Ziyi Liang · Ben Gaffinet · Valentin T. 
Bickel · Ben Moseley · Miguel Olivares 🔗 - Source Identification and Field Reconstruction of Advection-Diffusion Process from Sparse Sensor Measurements (Poster) []  Inferring the source information of greenhouse gases, such as methane, from spatially sparse sensor observations is an essential element in mitigating climate change. While it is well understood that the complex behavior of the atmospheric dispersion of such pollutants is governed by the Advection-Diffusion equation, it is difficult to directly apply the governing equations to identify the source information because of the spatially sparse observations, i.e., the pollution concentration is known only at the sensor locations. Here, we develop a multi-task learning framework that can provide high-fidelity reconstruction of the concentration field and identify emission characteristics of the pollution sources such as their location, emission strength, etc. from sparse sensor observations. We demonstrate that our proposed framework is able to achieve accurate reconstruction of the methane concentrations from sparse sensor measurements as well as precisely pin-point the location and emission strength of these pollution sources. Arka Daw · Kyongmin Yeo · Anuj Karpatne · 🔗 - Geometry-aware Autoregressive Models for Calorimeter Shower Simulations (Poster) []  Calorimeter shower simulations are often the bottleneck in simulation time for particle physics detectors. A lot of effort is currently spent on optimising generative architectures for specific detector geometries, which generalise poorly. We develop a geometry-aware autoregressive model on a range of calorimeter geometries such that the model learns to adapt its energy deposition depending on the size and position of the cells. This is a key proof-of-concept step towards building a model that can generalize to new unseen calorimeter geometries with little to no additional training. 
Such a model can replace the hundreds of generative models used for calorimeter simulation in a Large Hadron Collider experiment. For the study of future detectors, such a model will dramatically reduce the large upfront investment usually needed in generating simulations. Junze Liu · Aishik Ghosh · Dylan Smith · Pierre Baldi · Daniel Whiteson 🔗 - Characterizing information loss in a chaotic double pendulum with the Information Bottleneck (Poster) []  A hallmark of chaotic dynamics is the loss of information with time. Although information loss is often expressed through a connection to Lyapunov exponents---valid in the limit of high information about the system state---this picture misses the rich spectrum of information decay across different levels of granularity. Here we show how machine learning presents new opportunities for the study of information loss in chaotic dynamics, with a double pendulum serving as a model system. We use the Information Bottleneck as a training objective for a neural network to extract information from the state of the system that is optimally predictive of the future state after a prescribed time horizon. We then decompose the optimally predictive information by distributing a bottleneck to each state variable, recovering the relative importance of the variables in determining future evolution. The framework we develop is broadly applicable to chaotic systems and pragmatic to apply, leveraging data and machine learning to monitor the limits of predictability and map out the loss of information. Kieran Murphy · Danielle S Bassett 🔗 - Detecting structured signals in radio telescope data using RKHS (Poster) []  Fast Radio Bursts (FRBs) are rare high-energy pulses detectable by radio telescopes whose physical description is currently unknown. Due to the volume of data produced by radio telescopes, efficient computational methods for automatically detecting FRBs and other signals of interest are required. 
The most basic of these methods involves fitting a physical model of frequency dispersion to the observed signal, and flagging a detection if the dedispersed signal has high power. This method can successfully detect simple pulses, but can fail to detect other interesting astronomical signals. We propose a method for dedispersion that does not use a physical model but instead uses a flexible element of a reproducing kernel Hilbert space (RKHS). Our method can outperform classical dedispersion on a benchmark of real and synthetic data consisting of FRBs and non-physical signals. Russell Tsuchida · Suk Yee Yong 🔗 - Statistical Inference for Coadded Astronomical Images (Poster) []  Coadded astronomical images are created by combining multiple single-exposure images. Because coadded images are smaller in terms of data size than the single-exposure images they summarize, loading and processing them is less computationally expensive. However, image coaddition introduces additional dependence among pixels, which complicates principled statistical analysis of them. We present a novel fully Bayesian approach for performing light source parameter inference on coadded astronomical images. Our method implicitly marginalizes over the single-exposure pixel intensities that contribute to the coadded images, giving it the computational efficiency necessary to scale to next-generation astronomical surveys. As a proof of concept, we show that our method for estimating the locations and fluxes of stars using simulated coadds outperforms a method trained on single-exposure images. Mallory Wang · Ismael Mendoza · Jeffrey Regier · Camille Avestruz · Cheng Wang 🔗 - Domain Adaptation for Simulation-Based Dark Matter Searches with Strong Gravitational Lensing (Poster) []  The application of machine learning for quantifying dark matter substructure is growing in popularity. 
However, due to the differences with the real instrumental data, machine learning models trained on simulations are expected to lose accuracy when applied to real data. Here, domain adaptation can serve as a crucial bridge between simulations and real data applications. In this work, we demonstrate the power of domain adaptation techniques applied to strong gravitational lensing data with dark matter substructure. We show with simulated data sets representative of Euclid and Hubble Space Telescope (HST) observations that domain adaptation can significantly mitigate the losses in the model performance when applied to new domains. Pranath Reddy Kumbam · Sergei Gleyzer · Michael Toomey · Marcos Tidball 🔗 - A hybrid Reduced Basis and Machine-Learning algorithm for building Surrogate Models: a first application to electromagnetism (Poster) A surrogate model approximates the outputs of a Partial Differential Equations (PDEs) solver with a low computational cost. In this article, we propose a method to build learning-based surrogates in the context of parameterized PDEs, which are PDEs that depend on a set of parameters but are also temporal and spatial processes. Our contribution is a method hybridizing the Proper Orthogonal Decomposition and several Support Vector Regression machines. We present promising results on a first electromagnetic use case (a primitive single-phase transformer). Alejandro Ribes · Ruben Persicot · Lucas Meyer · Jean-Pierre Ducreux 🔗 - Data-driven discovery of non-Newtonian astronomy via learning non-Euclidean Hamiltonian (Poster) []  Incorporating the Hamiltonian structure of physical dynamics into deep learning models provides a powerful way to improve the interpretability and prediction accuracy. 
While previous works are mostly limited to the Euclidean spaces, their extension to the Lie group manifold is needed when rotations form a key component of the dynamics, such as the higher-order physics beyond simple point-mass dynamics for N-body celestial interactions. Moreover, the multiscale nature of these processes presents a challenge to existing methods as a long time horizon is required. By leveraging a symplectic Lie-group manifold preserving integrator, we present a method for data-driven discovery of non-Newtonian astronomy. Preliminary results show the importance of both these properties in training stability and prediction accuracy. Oswin So · Gongjie Li · Evangelos Theodorou · Molei Tao 🔗 - Deconvolving Detector Effects for Distribution Moments (Poster) []  Deconvolving ('unfolding') detector distortions is a critical step in the comparison of cross section measurements with theoretical predictions in particle and nuclear physics. However, most extant unfolding approaches require histogram binning while many theoretical predictions are at the level of moments. We develop a new approach to directly unfold distribution moments as a function of other observables without having to first discretize the data. Our Moment Unfolding technique uses machine learning and is inspired by Generative Adversarial Networks (GANs). We demonstrate the performance of this approach using jet substructure measurements in collider physics. Krish Desai · Benjamin Nachman · Jesse Thaler 🔗 - Multi-scale Digital Twin: Developing a fast and physics-infused surrogate model for groundwater contamination with uncertain climate models (Poster) []  Soil and groundwater contamination is a pervasive problem at thousands of locations across the world. Contaminated sites often require decades to remediate or to monitor natural attenuation. 
Climate change exacerbates the long-term site management problem because extreme precipitation and/or shifts in precipitation/evapotranspiration regimes could re-mobilize contaminants and proliferate affected groundwater. To quickly assess the spatiotemporal variations of groundwater contamination under uncertain climate disturbances, we developed a physics-informed machine learning surrogate model using U-Net enhanced Fourier Neural Operator (U-FNO) to solve Partial Differential Equations (PDEs) of groundwater flow and transport simulations at the site scale. We develop a combined loss function that includes both data-driven factors and physical boundary constraints at multiple spatiotemporal scales. Our U-FNOs can reliably predict the spatiotemporal variations of groundwater flow and contaminant transport properties from 1954 to 2100 with realistic climate projections. In parallel, we develop a convolutional autoencoder combined with online clustering to reduce the dimensionality of the vast historical and projected climate data by quantifying climatic region similarities across the United States. The ML-based unique climate clusters provide climate projections for the surrogate modeling and help return reliable future recharge rate projections immediately without querying large climate datasets. In all, this Multi-scale Digital Twin work can advance the field of environmental remediation under climate change. Lijing Wang · Takuya Kurihana · Aurelien Meray · Ilijana Mastilovic · Satyarth Praveen · Zexuan Xu · Milad Memarzadeh · Alexander Lavin · Haruko Wainwright 🔗 - Topological Jet Tagging (Poster) []  Proton-proton collisions at the large hadron collider result in the creation of unstable particles. The decays of many of these particles produce collimated sprays of particles referred to as jets. To better understand the physics processes occurring in the collisions, one needs to classify the jets, a process known as jet tagging. 
Given the enormous amount of data generated during such experiments, and the subtleties between different signatures, jet tagging is of vital importance and allows us to discard events which are not of interest --- a critical part of dealing with such high-throughput data. We present a new approach to jet tagging that leverages topological properties of jets to capture their inherent shape. Our method respects underlying physical symmetries, is robust to noise, and exhibits predictive performance on par with more complex, heavily-parametrized approaches. Dawson Thomas · Sarah Demers · Smita Krishnaswamy · Bastian Rieck 🔗 - Physics-Driven Convolutional Autoencoder Approach for CFD Data Compressions (Poster) []  With the growing size and complexity of turbulent flow models, data compression approaches are of the utmost importance to analyze, visualize, or restart the simulations. Recently, in-situ autoencoder-based compression approaches have been proposed and shown to be effective at producing reduced representations of turbulent flow data. However, these approaches focus solely on training the model using point-wise sample reconstruction losses that do not take advantage of the physical properties of turbulent flows. In this paper, we show that training autoencoders with additional physics-informed regularizations, e.g., enforcing incompressibility and preserving enstrophy, improves the compression model in three ways: (i) the compressed data better conform to known physics for homogeneous isotropic turbulence without negatively impacting point-wise reconstruction quality, (ii) inspection of the gradients of the trained model uncovers changes to the learned compression mapping that can facilitate the use of explainability techniques, and (iii) as a performance byproduct, training losses are shown to converge up to 12x faster than the baseline model. 
Alberto Olmo · Ahmed Zamzam · Andrew Glaws · Ryan King 🔗 - Recovering Galaxy Cluster Convergence from Lensed CMB with Generative Adversarial Networks (Poster) []  We present a new method which leverages conditional Generative Adversarial Networks (cGAN) to reconstruct galaxy cluster convergence from lensed CMB temperature maps. Our model is constructed to emphasize structure and high-frequency correctness relative to the Residual U-Net approach presented by Caldeira et al. (2019). Ultimately, we demonstrate that while both models perform similarly in the no-noise regime (as well as after random off-centering of the cluster center), cGAN outperforms ResUNet when processing CMB maps noised with 5uK/arcmin white noise or astrophysical foregrounds (tSZ and kSZ); this out-performance is especially pronounced at high l, which is exactly the regime in which the ResUNet under-performs traditional methods. Liam Parker · Dongwon Han · Shirley Ho · Pablo Lemos 🔗 - DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking (Poster) []  Predicting the binding structure of a small molecule ligand to a protein---a task known as molecular docking---is critical to drug design. Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods but have yet to offer substantial improvements in accuracy. We instead frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses. To do so, we map this manifold to the product space of the degrees of freedom (translational, rotational, and torsional) involved in docking and develop an efficient diffusion process on this space. Empirically, DiffDock obtains a 38% top-1 success rate (RMSD<2Å) on PDBBind, significantly outperforming the previous state-of-the-art of traditional docking (23%) and deep learning (20%) methods. 
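As a rough illustration of the incompressibility regularization mentioned in the abstract above, the sketch below adds a mean-squared-divergence penalty to a point-wise reconstruction loss. The function names, the periodic 2D grid, the central-difference stencil and the weight value are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def divergence_penalty(u, v, dx=1.0):
    """Mean squared divergence of a 2D velocity field (u, v) on a periodic grid.

    Axis 1 is treated as x and axis 0 as y (hypothetical layout for this sketch).
    """
    # Central differences; np.roll supplies the periodic wrap-around.
    du_dx = (np.roll(u, -1, axis=1) - np.roll(u, 1, axis=1)) / (2.0 * dx)
    dv_dy = (np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0)) / (2.0 * dx)
    return float(np.mean((du_dx + dv_dy) ** 2))


def physics_informed_loss(recon, target, dx=1.0, weight=0.1):
    """Point-wise MSE plus a weighted incompressibility penalty.

    recon and target have shape (2, ny, nx): channel 0 is u, channel 1 is v.
    The weight of 0.1 is an arbitrary illustrative value.
    """
    mse = float(np.mean((recon - target) ** 2))
    return mse + weight * divergence_penalty(recon[0], recon[1], dx)
```

A divergence-free field (for example one derived from a stream function) incurs essentially no penalty, so the extra term only discourages reconstructions that violate incompressibility.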
Moreover, DiffDock has fast inference times and provides confidence estimates with high selective accuracy. Gabriele Corso · Hannes Stärk · Bowen Jing · Regina Barzilay · Tommi Jaakkola 🔗 - Normalizing Flows for Fragmentation and Hadronization (Poster) []  Hadronization is an important step in Monte Carlo event generators, where quarks and gluons are bound into physically observable hadrons. Previous work has demonstrated first steps towards a machine-learning (ML) based simulation of the hadronization process. However, the presented architectures are limited to producing only pions as hadron emissions. In this work we use normalizing flows to overcome this limitation. We use masked autoregressive flows as a generator for the kinematic distributions in the hadronization pipeline. We condition NFs on different hadron masses and initial configuration energies, which allows for the emission of hadrons with arbitrary masses. The NF generated kinematic distributions match the Pythia generated ones well. In this paper we present our preliminary results. Ahmed Youssef · Philip Ilten · Tony Menzo · Jure Zupan · Manuel Szewc · Stephen Mrenna · Michael K. Wilkinson 🔗 - Astronomical Image Coaddition with Bundle-Adjusting Radiance Fields (Poster) []  Image coaddition is of critical importance to observational astronomy. This family of methods consisting of several processing steps such as image registration, resampling, deconvolution, and artifact removal is used to combine images into a single higher-quality image. An alternative to these methods that are built upon vectorized operations is the representation of an image function as a neural network, which has had considerable success in machine learning image processing applications. 
We propose a deep learning method employing gradient-based planar alignment with Bundle-Adjusting Radiance Fields (BARF) to combine, de-noise, and remove obstructions from observations of cosmological objects at different resolutions, seeing, and noise levels -- tasks not currently possible within a single process in astronomy. We test our algorithm on artificial images of star clusters, demonstrating powerful artifact removal and de-noising. Harlan Hutton · Harshitha Palegar · Shirley Ho · Miles Cranmer · Peter Melchior · Jenna Eubank 🔗 - Differentiable Physics-based Greenhouse Simulation (Poster) []  We present a differentiable greenhouse simulation model based on physical processes whose parameters can be obtained by training from real data. The physics-based simulation model is fully interpretable and is able to do state prediction for both climate and crop dynamics in the greenhouse over a very long time horizon. The model works by constructing a system of linear differential equations and solving them to obtain the next state. We propose a procedure to solve the differential equations, handle the problem of missing unobservable states in the data, and train the model efficiently. Our experiment shows the procedure is effective. The model improves significantly after training and can simulate a greenhouse that grows cucumbers accurately. Nhat M. Nguyen · Hieu Tran · Minh Duong · Hanh Bui · Kenneth Tran 🔗 - Plausible Adversarial Attacks on Direct Parameter Inference Models in Astrophysics (Poster) []  In this work we explore the possibility of introducing biases in physical parameter inference models from adversarial-type attacks. In particular, we inject small amplitude systematics into inputs to a mixture density network tasked with inferring cosmological parameters from observed data. The systematics are constructed analogously to white-box adversarial attacks. 
We find that the analysis network can be tricked into spurious detection of new physics in cases where standard cosmological estimators would be insensitive. This calls into question the robustness of such networks and their utility for reliably detecting new physics. Benjamin Horowitz · Peter Melchior 🔗 - GAN-Flow: A dimension-reduced variational framework for physics-based inverse problems (Poster) []  We propose GAN-Flow -- a modular inference approach that combines generative adversarial network (GAN) prior with a normalizing flow (NF) model to solve inverse problems in the lower-dimensional latent space of the GAN prior using variational inference. GAN-Flow leverages the intrinsic dimension reduction and superior sample generation capabilities of GANs, and the capability of NFs to efficiently approximate complicated posterior distributions. In this work, we apply GAN-Flow to solve two physics-based linear inverse problems. Results show that GAN-Flow can efficiently approximate the posterior distribution in such high-dimensional problems. Agnimitra Dasgupta · Dhruv Patel · Deep Ray · Erik Johnson · Assad Oberai 🔗 - Control and Calibration of GlueX Central Drift Chamber Using Gaussian Process Regression (Poster) []  The Gluonic Excitations (GlueX) experiment is designed to search for exotic hybrid mesons produced in photoproduction reactions and to study the hybrid meson spectrum predicted from Lattice Quantum Chromodynamics. For the first time, the GlueX Central Drift Chamber was autonomously controlled using machine learning (ML) to calibrate in real time while recording cosmic ray tracks. We demonstrate the ability of a Gaussian Process to predict the gain correction calibration factor used to determine a high voltage setting that will stabilize the CDC gain in response to changing environmental conditions. 
We demonstrate the use of a data-driven method to calibrate a drift chamber via high-voltage control during an experiment in contrast to the traditional, computationally expensive method of calibrating raw data after data collection is complete. Diana McSpadden · Torri Jeske · Naomi Jarvis · David Lawrence · Thomas Britton · nikhil kalra 🔗 - Emulating cosmological growth functions with B-Splines (Poster) []  In the light of GPU accelerations, sequential operations such as solving ordinary differential equations can be bottlenecks for gradient evaluations and hinder potential speed gains. In this work, we focus on growth functions and their time derivatives in cosmological particle mesh simulations and show that these are the majority time cost when using gradient based inference algorithms. We propose to construct novel conditional B-spline emulators which directly learn an interpolating function for the growth factor as a function of time, conditioned on the cosmology. We demonstrate that these emulators are sufficiently accurate to not bias our results for cosmological inference and can lead to over an order of magnitude gains in time, especially for small to intermediate size simulations. Ngai Pok Kwan · Chirag Modi · Yin Li · Shirley Ho 🔗 - ClimFormer - a Spherical Transformer model for long-term climate projections (Poster) []  Clouds play an important role in balancing the Earth's energy budget. Research has indicated a rise in global average temperatures will lead to thinning of stratocumulus low clouds acting as a positive feedback on warming. Current state-of-the-art Earth System Models do not resolve cloud physics appropriately due to spatial resolution limitations, making it harder to model the cloud-climate feedback. In this study, we propose to learn this feedback with a transformer. To better respect the spatial structure of Earth, we transform the data to a spherical grid. 
Our resulting spherical transformer called ClimFormer -- using state of the art Fourier Neural Operator mixing -- is able to model this important energy exchange mechanism, and performs strongly on an out-of-distribution evaluation. Salva Rühling Cachay · Peetak Mitra · Sookyung Kim · Subhashis Hazarika · Haruki Hirasawa · Dipti Hingmire · Hansi Singh · Kalai Ramea 🔗 - Computing the Bayes-optimal classifier and exact maximum likelihood estimator with a semi-realistic generative model for jet physics (Poster) []  Deep learning techniques have proven to be extremely effective in studying complicated, collimated sprays of particles found in high energy particle collisions known as jets. As with most realistic classification tasks, the Bayes-optimal classifier is unknown or intractable, even when trained with simulated data. Here we consider Ginkgo, a semi-realistic simulator for jets that captures the essential physics and produces data with similar features and format. By using a recently-developed hierarchical trellis data structure and dynamic programming algorithm, we are able to exactly marginalize over the combinatorically large space of latent variables associated to this generative model. This allows us to compute the Bayes-optimal classifier and the exact maximum likelihood estimator for this model, which can serve as a powerful benchmarking tool for studying the performance of machine learning approaches to these problems. Kyle Cranmer · Matthew Drnevich · Lauren Greenspan · Sebastian Macaluso · Duccio Pappadopulo 🔗 - The Senseiver: attention-based global field reconstruction from sparse observations (Poster) []  The reconstruction of complex time-evolving fields from a small number of sensor observations is a grand challenge in a wide range of scientific and industrial applications. Frequently, sensors have very sparse spatial coverage, and report noisy observations from highly non-linear phenomena. 
While numerical simulations can model some of these phenomena in a classical manner, the inverse problem is not well-posed, hence data-driven modeling can provide crucial disambiguation. Here we present the \textit{Senseiver}, an attention-based framework that excels in the task of reconstructing spatially-complex fields from a small number of observations. Building on the \textit{Perceiver IO} model, the Senseiver reconstructs complex \textit{n}-dimensional fields accurately using a small number of sensor observations by encoding arbitrarily-sized sparse sets of inputs into a latent space using cross-attention, which produces a uniform-sized space regardless of the number of observations. This same property allows very efficient training as a consequence of the being able to decode only a sparse set of observations as outputs. This enables efficient training of data with complex boundary conditions (sea temperature) and to extremely large and complex domains (3D porous media). We show that the Senseiver sets a new state of the art for three existing datasets, including real-world sea temperature observations, and pushes the bounds of sparse reconstruction using a large-scale simulation of two fluids flowing through a complex 3D domain. Javier E. Santos · Zachary Fox · Arvind Mohan · Hari Viswanathan · NIcholas Lubbers 🔗 - SE(3)-equivariant self-attention via invariant features (Poster) []  In this work, we use classical invariant theory to construct a self-attention module equivariant to 3D rotations and translations. The parameterization is based on the characterization of SE(3)-equivariant functions via the invariants ---scalar products of vectors and certain subdeterminants. This parameterization can be seen as a natural extension to a (more straightforward) E(3) equivariant attention based on invariants ---scalar products or pairwise distances of vectors. 
We evaluate our model using a toy N-body particle simulation dataset and a real-world dataset of molecular properties. Our model is easy to implement and it exhibits comparable performance and running time to state-of-the-art methods. Nan Chen · Soledad Villar 🔗 - Skip Connections for High Precision Regressors (Poster) []  Monte Carlo simulations of physical processes at particle colliders like the Large Hadron Collider at CERN take up a major fraction of the computational budget. For some simulations, a single data point takes seconds, minutes, or even hours to compute from first principles. Since the necessary number of data points per simulation is on the order of $10^9$ -- $10^{12}$, machine learning regressors can be used in place of physics simulators to reduce this computational burden significantly. However, this task requires high-precision regressors that can deliver data with relative errors less than 1% or even 0.1% over the entire domain of the function. In this paper, we develop optimal training strategies and tune various machine learning regressors to satisfy the high-precision requirement. We leverage symmetry arguments from particle physics to optimize the performance of the regressors. Inspired by ResNets, we design a Deep Neural Network with skip connections that outperforms fully connected Deep Neural Networks. We find that at lower dimensions, boosted decision trees far outperform neural networks while at higher dimensions neural networks perform better. Our work can significantly reduce the training and storage burden of Monte Carlo simulations at current and future collider experiments. Ayan Paul · Fady Bishara · Jennifer Dy 🔗 - Likelihood-Free Frequentist Inference for Calorimetric Muon Energy Measurement in High-Energy Physics (Poster) []  Muons have proven to be excellent probes of new physical phenomena, but the precision of traditional curvature-based measurements of their energy degrades at high energies. 
Recent work has shown the feasibility of a new avenue for the precise estimation of high-energy muons by exploiting the pattern of energy losses in a dense, finely segmented calorimeter using convolutional neural networks (CNNs). However, CNN predictions of the muon energy suffered from significant bias, which hampers the reliability of traditional methods for quantifying the uncertainty of the estimates. Indeed, to date, there is no known solution to the general problem of producing reliable uncertainty estimates of internal parameters of a statistical model from point predictions. In this paper, we propose WALDO, a new method that reframes the Wald test and uses the Neyman construction to convert point predictions into valid confidence sets. We show that WALDO achieves confidence sets with correct coverage regardless of the true muon energy value, while leveraging predictions from a CNN over a high-dimensional input space. In addition, we show that despite an increasing dimensionality, WALDO is able to extract useful information from a finer segmentation of the calorimeter, yielding smaller confidence sets, and hence more precise estimates of the muon energies. Luca Masserano · Ann Lee · Rafael Izbicki · Mikael Kuusela · Tommaso Dorigo 🔗 - Uncertainty Aware Deep Learning for Particle Accelerators (Poster) []  Standard deep learning models for classification and regression applications are ideal for capturing complex system dynamics. However, their predictions can be arbitrarily inaccurate when the input samples are not similar to the training data. Implementation of distance aware uncertainty estimation can be used to detect these scenarios and provide a level of confidence associated with their predictions. 
In this paper, we present results from using Deep Gaussian Process Approximation (DGPA) methods for errant beam prediction at Spallation Neutron Source (SNS) accelerator (classification) and we provide an uncertainty aware surrogate model for the Fermi National Accelerator Lab (FNAL) Booster Accelerator Complex (regression). Kishansingh Rajput · Malachi Schram · Karthik Somayaji NS 🔗 - Graphical Models are All You Need: Per-interaction reconstruction uncertainties in a dark matter detection experiment (Poster) []  We demonstrate that Bayesian networks fill a significant methodology gap for uncertainty quantification in particle physics, providing a framework for modeling complex systems with physical constraints. To address the problem of interaction position reconstruction in dark matter direct-detection experiments, we built a Bayesian network that utilizes domain knowledge of the system in both the structure of the graph and the representation of the random variables. This method yielded highly informative per-interaction uncertainties that were previously unattainable using existing methodologies, while also demonstrating comparable precision on reconstructed positions. Christina Peters · Aaron Higuera · Shixiao Liang · Waheed Bajwa · Christopher Tunnell 🔗 - PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics (Poster) []  Many current approaches to machine learning in particle physics use generic architectures that require large numbers of parameters, often adapted from unrelated data science or industry applications, and disregard underlying physics principles, thereby limiting their applicability as scientific modeling tools. In this work, we present a machine learning architecture that uses a set of inputs maximally reduced with respect to the full 6-dimensional Lorentz symmetry, and is fully permutation-equivariant throughout. 
We study the application of this network architecture to the standard task of classifying the origin of jets produced by either hadronically-decaying massive top quarks or light quarks, and show that the resulting network outperforms all existing competitors despite significantly lower model complexity. In addition, we present a Lorentz-covariant variant of the same network applied to a 4-momentum regression task in which we predict the full 4-vector of the W boson from a top quark decay process. Jan Offermann · Alexander Bogatskiy · Timothy Hoffman · David W Miller 🔗 - FO-PINNs: A First-Order formulation for Physics Informed Neural Networks (Poster) []  We present FO-PINNs, physics-informed neural networks that are trained using the first-order formulation of the Partial Differential Equation (PDE) losses. We show that FO-PINNs offer significantly higher accuracy in solving parameterized systems compared to traditional PINNs, and reduce time-per-iteration by removing the extra backpropagations needed to compute the second or higher-order derivatives. Additionally, unlike standard PINNs, FO-PINNs can be used with exact imposition of boundary conditions using approximate distance functions, and can be trained using Automatic Mixed Precision (AMP) to further speed up the training. Through two examples, Helmholtz and Navier-Stokes, we demonstrate the advantages of FO-PINNs over traditional PINNs in terms of accuracy and training speedup. FO-PINN has been developed using the Modulus framework by NVIDIA, and the source code is available at https://developer.nvidia.com/modulus. Rini Jasmine Gladstone · Mohammad Amin Nabian · Hadi Meidani 🔗 - Learning the nonlinear manifold of extreme aerodynamics (Poster) []  With the increased occurrence of extreme events and miniaturization of aircraft, it has become an urgent task to understand aerodynamics in highly turbulent flight environments. 
We propose a physics-embedded autoencoder to discover a low-dimensional compact manifold representation of extreme aerodynamics. The present method is demonstrated with the highly nonlinear dynamics of vortex gust-airfoil wake interaction around a NACA0012 airfoil over a range of configurations. The present model extracts key features of the high-dimensional airfoil wake dynamics on a physically interpretable and compact manifold, covering a massive number of wake scenarios across a huge parameter space that determines the characteristics of complex gusty flow conditions. Our data-driven approach offers a new avenue for expressing the seemingly high-dimensional fluid flow systems by identifying the low-dimensional data coordinates that can also be leveraged for data compression and flow control. Kai Fukami · Kunihiko Taira 🔗 - Geometric NeuralPDE (GNPnet) Models for Learning Dynamics (Poster) []  Real-world phenomena, such as cellular dynamics, electromagnetic wave propagation, and heat diffusion vary with respect to space and time and therefore can be described by partial differential equations (PDEs). We focus on the problem of finding a dynamic model and parameterization that can generate and match observed time-series data. For this purpose, we introduce Geometric Neural PDE network (GNPnet), a neural network that learns to match and interpolate measured phenomena using an autoregressive framework. GNPnet has several novel features including a geometric scattering network that leverages spatial problem structure, and an FEM solver that is incorporated within the network. GNPnet learns parameters of a PDE via an FEM solver that generates solution values that are compared with measured phenomena. By using the adjoint sensitivity method to differentiate the output loss function, we can train the model end-to-end. We demonstrate GNPnet by learning the parameters of a simulated wave equation. 
Oluwadamilola Fasina · Smita Krishnaswamy · Aditi Krishnapriyan 🔗 - Can denoising diffusion probabilistic models generate realistic astrophysical fields? (Poster) []  Score-based generative models have emerged as alternatives to generative adversarial networks (GANs) and normalizing flows for tasks involving learning and sampling from complex image distributions. In this work we investigate the ability of these models to generate fields in two astrophysical contexts: dark matter mass density fields from cosmological simulations and images of interstellar dust. We examine the fidelity of the sampled cosmological fields relative to the true fields using three different metrics, and identify potential issues to address. We demonstrate a proof-of-concept application of the model trained on dust in denoising dust images. To our knowledge, this is the first application of this class of models to the interstellar medium. Nayantara Mudur · Douglas P. Finkbeiner 🔗 - PIPS: Path Integral Stochastic Optimal Control for Path Sampling in Molecular Dynamics (Poster) We consider the problem of Sampling Transition Paths: given two metastable conformational states of a molecular system, e.g., a folded and unfolded protein, we aim to sample the most likely transition path between the two states. Sampling such a transition path is computationally expensive due to the existence of high free energy barriers between the two states. To circumvent this, previous work has focused on simplifying the trajectories to occur along specific molecular descriptors called Collective Variables (CVs). However, finding CVs is nontrivial and requires chemical intuition. For larger molecules, where intuition is not sufficient, using these CV-based methods biases the transition along possibly irrelevant dimensions. In this work, we propose a method for sampling transition paths that considers the entire geometry of the molecules. 
We achieve this by relating the problem to recent works on the Schrödinger bridge problem and stochastic optimal control. Using this relation, we construct a path integral method that incorporates important characteristics of molecular systems such as second-order dynamics and invariance to rotations and translations. We demonstrate our method on commonly studied protein structures like Alanine Dipeptide, and also consider larger proteins such as Polyproline and Chignolin. Lars Holdijk · Yuanqi Du · Ferry Hooft · Priyank Jaini · Berend Ensing · Max Welling 🔗 - Predicting Full-Field Turbulent Flows Using Fourier Neural Operator (Poster) []  We present an experimental application of Fourier neural operators (FNOs) for predicting the temporal development of wakes behind tandem bluff body arrangements at a Reynolds number of $Re \approx 1500$. FNOs are recently introduced tools in machine learning capable of approximating solution operators to partial differential equations, such as the Navier-Stokes equations, through data alone. Once trained, FNOs can predict full-field solutions in milliseconds. Here we apply this method to experimental velocity fields acquired via particle image velocimetry and compare the predicted temporal developments of the learned solution operator with the actual measurements taken at those timesteps. We find that FNOs are capable of accurately predicting wake developments hundreds of milliseconds into the future. Using several tandem cylinder configurations, we also demonstrate that learned solution operators are surprisingly capable of adapting to unseen conditions and generalizing wake dynamics across different arrangements. Peter Renn · Sahin Lale · Cong Wang · Zongyi Li · Anima Anandkumar · Morteza Gharib 🔗 - A Self-Supervised Approach to Reconstruction in Sparse X-Ray Computed Tomography (Poster) []  Computed tomography has propelled scientific advances in fields from biology to materials science. 
This technology allows for the elucidation of 3-dimensional internal structure by the attenuation of x-rays through an object at different rotations relative to the beam. By imaging 2-dimensional projections, a 3-dimensional object can be reconstructed through a computational algorithm. Imaging at a greater number of rotation angles allows for improved reconstruction. However, taking more measurements increases the x-ray dose and may cause sample damage. Deep neural networks have been used to transform sparse 2-D projection measurements to a 3-D reconstruction by training on a dataset of known similar objects. However, obtaining high-quality object reconstructions for the training dataset requires high x-ray dose measurements that can destroy or alter the specimen before imaging is complete. This becomes a chicken-and-egg problem: high-quality reconstructions cannot be generated without deep learning, and the deep neural network cannot be learned without the reconstructions. This work develops and validates a self-supervised probabilistic deep learning technique, the physics-informed variational autoencoder, to solve this problem. A dataset consisting solely of sparse projection measurements from each object is used to jointly reconstruct all objects of the set. This approach has the potential to allow visualization of fragile samples with x-ray computed tomography. We release our code for reproducing our results at: https://github.com/vganapati/CT_PVAE. Rey Mendoza · Minh Nguyen · Judith Weng Zhu · Talita Perciano · Vincent Dumont · Juliane Mueller · Vidya Ganapati 🔗 - Energy based models for tomography of quantum spin-lattice systems (Poster) []  We present a novel method to learn Energy-Based Models (EBM) from quantum tomography data. We represent quantum states via distributions generated by generalized measurements and use state-of-the-art algorithms for energy function learning to obtain a representation of these states as classical Gibbs distributions. 
Our results show that this method is especially well suited for learning quantum thermal states. For the case of ground states, we find that the learned EBMs often have an effective temperature that makes learning easier, especially in the paramagnetic phase. Abhijith Jayakumar · Marc Vuffray · Andrey Lokhov 🔗 - Elements of effective machine learning datasets in astronomy (Poster) []  In this work, we identify elements of effective machine learning datasets in astronomy and present suggestions for their design and creation. Machine learning has become an increasingly important tool for analyzing and understanding the large-scale flood of data in astronomy. To take advantage of these tools, datasets are required for training and testing. However, building machine learning datasets for astronomy can be challenging. Astronomical data is collected from instruments built to explore science questions in a traditional fashion rather than to conduct machine learning. Thus, it is often the case that raw data, or even downstream processed data, is not in a form amenable to machine learning. We explore the construction of machine learning datasets and we ask: what elements define effective machine learning datasets? We define effective machine learning datasets in astronomy to be formed with well-defined data points, structure, and metadata. We discuss why these elements are important for astronomical applications and ways to put them in practice. We posit that these qualities not only make the data suitable for machine learning, they also help to foster usable, reusable, and replicable science practices. Bernie Boscoe · Tuan Do 🔗 - Towards a non-Gaussian Generative Model of large-scale Reionization Maps (Poster) []  High-dimensional data sets are expected from the next generation of large-scale surveys. These data sets will carry a wealth of information about the early stages of galaxy formation and cosmic reionization. 
Extracting the maximum amount of information from these data sets remains a key challenge. Current simulations of cosmic reionization are computationally too expensive to provide enough realizations to enable testing different statistical methods, such as parameter inference. We present a non-Gaussian generative model of reionization maps that is based solely on their summary statistics. We reconstruct large-scale ionization fields (bubble spatial distributions) directly from their power spectra (PS) and Wavelet Phase Harmonics (WPH) coefficients. Using WPH, we show that our model is efficient in generating diverse new examples of large-scale ionization maps from a single realization of a summary statistic. We compare our model with the target ionization maps using the bubble size statistics, and largely find a good agreement. As compared to PS, our results show that WPH provide optimal summary statistics that capture most of the information in highly non-linear ionization fields. Yu-Heng Lin · Sultan Hassan · Bruno Régaldo-Saint Blancard · Michael Eickenberg · Chirag Modi 🔗 - Adversarial Noise Injection for Learned Turbulence Simulations (Poster) []  Machine learning is a powerful way to learn effective dynamics of physical simulations, and has seen great interest from the community in recent years. Recent work has shown that deep neural networks trained in an end-to-end manner seem capable of learning to predict turbulent dynamics on coarse grids more accurately than classical solvers. All these works point out that adding Gaussian noise to the input during training is indispensable to improve the stability and roll-out performance of learned simulators, as an alternative to training through multiple steps. In this work we bring insights from robust machine learning and propose to inject adversarial noise to bring machine learning systems a step further towards improving generalization in ML-assisted physical simulations. 
We advocate that training our models on these worst-case perturbations instead of model-agnostic Gaussian noise might lead to better rollout and hope that adversarial noise injection becomes a standard tool for ML-based simulations. We show experimentally in the 2D-setting that for certain classes of turbulence adversarial noise can help stabilize model rollouts, maintain a lower loss and preserve other physical properties such as energy. In addition, we identify a potentially more challenging task, driven 2D-turbulence, and show that while none of the noise-based attempts significantly improve rollout, adversarial noise helps. Jingtong Su · Julia Kempe · Drummond Fielding · Nikolaos Tsilivis · Miles Cranmer · Shirley Ho 🔗 - Shining light on data (Poster) []  Experimental sciences have come to depend heavily on our ability to organize, interpret and analyze high-dimensional datasets produced from observations of a large number of variables governed by natural processes. Natural laws, conservation principles, and dynamical structure introduce intricate inter-dependencies among these observed variables, which in turn yield geometric structure, with fewer degrees of freedom, on the dataset. We show how fine-scale features of this structure in data can be extracted from discrete approximations to quantum mechanical processes given by data-driven graph Laplacians and localized wavepackets. This leads to a novel, yet natural uncertainty principle for data analysis induced by limited data. We illustrate some applications to learning with algorithms on several model examples and real-world datasets. Akshat Kumar · Mohan Sarovar 🔗 - A Novel Automatic Mixed Precision Approach For Physics Informed Training (Poster) []  Physics Informed Neural Networks (PINNs) allow for a clean way of training models directly using physical governing equations. Training PINNs requires higher-order derivatives that typical data driven training does not require and increases training costs. 
In this work, we address the performance challenges of training PINNs by developing a new automatic mixed precision approach for physics informed training. This approach uses a derivative scaling strategy that enables the Automatic Mixed Precision (AMP) training for PINNs without running into training instabilities that the regular AMP approach encounters. Jinze Xue · Akshay Subramaniam · Mark Hoemmen 🔗 - Atmospheric retrievals of exoplanets using learned parameterizations of pressure-temperature profiles (Poster) []  We describe a new, learning-based approach for parameterizing the relationship between pressure and temperature in the atmosphere of an exoplanet. Our method can be used, for example, when estimating the parameters characterizing a planet's atmosphere from an observation of its spectrum with Bayesian inference methods (“atmospheric retrieval”). On two data sets, we show that our method requires fewer parameters and achieves, on average, better reconstruction quality than existing methods, all while still integrating easily into existing retrieval frameworks. This may help the analysis of exoplanet observations as well as the design of future instruments by speeding up inference, freeing up resources to retrieve more parameters, and paving a way to using more realistic atmospheric models for retrievals. Timothy Gebhard · Daniel Angerhausen · Björn Konrad · Eleonora Alei · Sascha Quanz · Bernhard Schölkopf 🔗 - Probabilistic Mixture Modeling For End-Member Extraction in Hyperspectral Data (Poster) Imaging spectrometers produce data with both spatial and spectroscopic resolution, a technique known as hyperspectral imaging (HSI). In a typical setting, the purpose of HSI is to disentangle a microscopic mixture of several material components in which each contributes a characteristic spectrum--often confounded by self-absorption effects, observation noise and other distortions. 
We outline a Bayesian mixture model enabling probabilistic inference of end member fractions while explicitly modeling observation noise and resulting inference uncertainties. We generate synthetic datasets and use Hamiltonian Monte Carlo to produce posterior samples that yield, for each set of observed spectra, an approximate distribution over end member coordinates. We find the model robust to the absence of pure (i.e. unmixed) observations as well as to the presence of non-isotropic Gaussian noise, both of which cause biases in the reconstructions produced by N-FINDER and other widespread end-member extraction algorithms. Oliver Hoidn · Aashwin Mishra · Apurva Mehta 🔗 - Posterior samples of source galaxies in strong gravitational lenses with score-based priors (Poster) Inferring accurate posteriors for high-dimensional representations of the brightness of gravitationally-lensed sources is a major challenge, in part due to the difficulties of accurately quantifying the priors. Here, we report the use of a score-based model to encode the prior for the inference of undistorted images of background galaxies. This model is trained on a set of high-resolution images of undistorted galaxies. By adding the likelihood score to the prior score and using a reverse-time stochastic differential equation solver, we obtain samples from the posterior. Our method produces independent posterior samples and models the data almost down to the noise level. We show how the balance between the likelihood and the prior meets our expectations in an experiment with out-of-distribution data. Alexandre Adam · Adam Coogan · Nikolay Malkin · Ronan Legin · Laurence Perreault-Levasseur · Yashar Hezaveh · Yoshua Bengio 🔗 - Particle-level Compression for New Physics Searches (Poster) []  In collider-based particle and nuclear physics experiments, data are produced at such extreme rates that only a subset can be recorded for later analysis. 
Typically, algorithms select individual collision events for preservation and store the complete experimental response. A relatively new alternative strategy is to additionally save a partial record for a subset of events, allowing for later specific analysis of a larger fraction of events. We propose a strategy that bridges these paradigms by compressing entire events for generic offline analysis but at a lower fidelity. An optimal-transport-based β Variational Autoencoder (VAE) is used to automate the compression and the hyperparameter β controls the compression fidelity. We introduce a new approach for multi-objective learning functions by simultaneously learning a VAE appropriate for all values of β through parameterization. We present an example use case, a di-muon resonance search at the Large Hadron Collider (LHC), where we show that simulated data compressed by our β VAE has enough fidelity to distinguish distinct signal morphologies. Yifeng Huang · Jack Collins · Benjamin Nachman · Simon Knapen · Daniel Whiteson 🔗 - CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds (Poster) []  Precision measurements and new physics searches at the Large Hadron Collider require efficient simulations of particle propagation and interactions within the detectors. The most computationally expensive simulations involve calorimeter showers. Advances in deep generative modelling -- particularly in the realm of high-dimensional data -- have opened the possibility of generating realistic calorimeter showers orders of magnitude more quickly than physics-based simulation. However, the high-dimensional representation of showers belies the relative simplicity and structure of the underlying physical laws. This phenomenon is yet another example of the manifold hypothesis from machine learning, which states that high-dimensional data is supported on low-dimensional manifolds. 
We thus propose modelling calorimeter showers first by learning their manifold structure, and then estimating the density of data across this manifold. Learning manifold structure reduces the dimensionality of the data, which enables fast training and generation when compared with competing methods. Jesse Cresswell · Brendan Ross · Gabriel Loaiza-Ganem · Humberto Reyes-Gonzalez · Marco Letizia · Anthony Caterini 🔗 - De-noising non-Gaussian fields in cosmology with normalizing flows (Poster) []  Fields in cosmology, such as the matter distribution, are observed by experiments up to experimental noise. The first step in cosmological data analysis is usually to de-noise the observed field using an analytic or simulation driven prior. On large enough scales, such fields are Gaussian, and the de-noising step is known as Wiener filtering. However, on smaller scales probed by upcoming experiments, a Gaussian prior is substantially sub-optimal because the true field distribution is very non-Gaussian. Using normalizing flows, it is possible to learn the non-Gaussian prior from simulations (or from more high-resolution observations), and use this knowledge to de-noise the data more effectively. We show that we can train a flow to represent the matter distribution of the universe, and evaluate how much signal-to-noise can be gained in idealized conditions, as a function of the experimental noise. We also introduce a patching method for reconstructing information on arbitrarily large images by dividing them up into small maps (where we reconstruct non-Gaussian features), and patching the small posterior maps together on large scales (where the field is Gaussian). 
Adam Rouhiainen · Moritz Münchmeyer 🔗 - Learning Integrable Dynamics with Action-Angle Networks (Poster) []  Machine learning has become increasingly popular for efficiently modelling the dynamics of complex physical systems, demonstrating a capability to learn effective models for dynamics which ignore redundant degrees of freedom. Learned simulators typically predict the evolution of the system in a step-by-step manner with numerical integration techniques. However, such models often suffer from instability over long roll-outs due to the accumulation of both estimation and integration error at each prediction step. Here, we propose an alternative construction for learned physical simulators that is inspired by the concept of action-angle coordinates from classical mechanics for describing integrable systems. We propose Action-Angle Networks, which learn a nonlinear transformation from input coordinates to the action-angle space, where evolution of the system is linear. Unlike traditional learned simulators, Action-Angle Networks do not employ any higher-order numerical integration methods, making them extremely efficient at modelling the dynamics of integrable physical systems. Ameya Daigavane · Arthur Kosmala · Miles Cranmer · Tess Smidt · Shirley Ho 🔗 - Physics-informed Bayesian Optimization of an Electron Microscope (Poster) Precise control of the electron beam probe is critical in scanning transmission electron microscopy (STEM) for understanding materials at the atomic level. However, the nature of magnetic lenses introduces various orders of aberrations and makes aberration corrector tuning a complex and time-consuming procedure. In this paper, we show that a deep neural network can accurately capture phase space variations from electron Ronchigrams, diffraction patterns from amorphous materials, allowing for the mapping to a single beam quality metric. 
A Bayesian approach is adopted to optimize the aberration correctors while providing the full posterior of the response to account for uncertainties. Furthermore, a deep kernel is implemented and shown to improve performance by effectively learning the correlations between input dimensions. This new scheme targets fully automated aberration corrector tuning, achieving greater speed and less human bias. Desheng Ma 🔗 - Why are deep learning-based models of geophysical turbulence long-term unstable? (Poster) Deep learning-based data-driven models of geophysical turbulence, e.g., data-driven weather forecasting models, have received substantial attention recently. These models, trained on observational data, are competitive with numerical weather prediction (NWP) models in terms of short-term performance and are devoid of numerical biases. They can be used for probabilistic forecasts with a large number of ensemble members, as well as efficient data assimilation at a computational cost which is several orders of magnitude smaller than that of NWP models. However, these data-driven models do not remain stable when integrated for a long time period (decadal time scales). This hinders their usefulness to simulate long-term climate statistics with synthetically generated data that could be used for studying the physical mechanisms of extreme events. A physical cause of this instability in data-driven models of weather, and generally turbulence, is so far unknown, and several ad-hoc strategies are often adopted for improving their stability. In this work, we propose a causal mechanism for this instability through the lenses of physical and deep learning theory and propose an architecture-agnostic mitigation strategy to obtain long-term stable models of weather, climate, and generally geophysical turbulence. 
Ashesh Chattopadhyay · Pedram Hassanzadeh 🔗 - Graph Structure from Point Clouds: Geometric Attention is All You Need (Poster) []  The use of graph neural networks has produced significant advances in point cloud problems, such as those found in high energy physics. The question of how to produce a graph structure in these problems is usually treated as a matter of heuristics, employing fully connected graphs or K-nearest neighbors. In this work, we elevate this question to utmost importance as the "Topology Problem". We propose an attention mechanism that allows a graph to be constructed in a learned space that captures all the relevant pairwise flow of information, potentially solving the Topology Problem. We test this architecture, called the Massless GravNet, on the task of top jet tagging, and show that it is competitive in tagging accuracy, and uses far fewer computational resources than all other comparable models. Daniel Murnane 🔗 - One-Class Dense Networks for Anomaly Detection (Poster) []  Unsupervised learning has been proposed as a tool for model agnostic anomaly detection (AD) in collider physics. While the goal of these approaches is usually to find events that are "rare" under the Standard Model hypothesis, many approaches are governed by heuristics to strive towards an implicit density estimator. We study the simplest possible one-class classification method for unsupervised AD and show that it has similar properties to other unsupervised methods. The method is illustrated using a Gaussian dataset and a simulation of the events at the Large Hadron Collider (LHC). The simplicity of the one-class classification may enable a deeper understanding of unsupervised AD in the future. 
Norman Karr · Benjamin Nachman · David Shih 🔗 - Self-supervised detection of atmospheric phenomena from remotely sensed synthetic aperture radar imagery (Poster) []  The European Space Agency provides unprecedented monitoring of Earth's oceans through a network of Synthetic Aperture Radar (SAR) satellites called Sentinel-1. Imagery from these satellites captures a variety of atmosphere and ocean surface phenomena including waves, atmospheric turbulence, ocean fronts, and marine biology. Computer vision methods have been used to process the large number of acquired images, but the use of machine learning methods has been severely limited by sparsely labeled data. Consequently, we apply a self-supervised learning method, SwAV, to three years of Sentinel-1 satellite observations (3 million images) to learn an unsupervised embedding for SAR images, then fine-tune the model to detect wind streaks and mesoscale convection cells through supervised learning. Our results demonstrate detection performance improvement over the previous state-of-the-art model but suggest that self-supervised training has marginal improvements over a more standard approach of transfer learning from a model trained on natural images. Yannik Glaser · Peter Sadowski · Justin Stopa 🔗 - Emulating cosmological multifields with generative adversarial networks (Poster) []  We explore the possibility of using deep learning to generate multifield images from state-of-the-art hydrodynamic simulations of the CAMELS project. We use a generative adversarial network to generate images with three different channels that represent gas density (Mgas), neutral hydrogen density (HI), and magnetic field amplitudes (B). The quality of each map in each example generated by the model looks very promising. The GAN considered in this study is able to generate maps whose mean and standard deviation of the probability density distribution of the pixels are consistent with those of the maps from the training data. 
The mean and standard deviation of the auto power spectra of the generated maps of each field agree well with those computed from the maps of IllustrisTNG. Moreover, the cross-correlations between fields in all instances produced by the emulator are in good agreement with those of the dataset. This implies that all three maps in each output of the generator encode the same underlying cosmology and astrophysics. Sambatra Andrianomena · Sultan Hassan · Francisco Villaescusa-Navarro 🔗 - Monte Carlo Techniques for Addressing Large Errors and Missing Data in Simulation-based Inference (Poster) []  Upcoming astronomical surveys will observe billions of galaxies across cosmic time, providing a unique opportunity to map the many pathways of galaxy assembly to an incredibly high resolution. However, the huge amount of data also poses an immediate computational challenge: current tools for inferring parameters from the light of galaxies take $\gtrsim$ 10 hours per fit. This is prohibitively expensive. Simulation-based Inference (SBI) is a promising solution. However, it requires simulated data with identical characteristics to the observed data, whereas real astronomical surveys are often highly heterogeneous, with missing observations and variable uncertainties determined by sky and telescope conditions. Here we present a Monte Carlo technique for treating out-of-distribution measurement errors and missing data using standard SBI tools. We show that out-of-distribution measurement errors can be approximated by using standard SBI evaluations, and that missing data can be marginalized over using SBI evaluations over nearby data realizations in the training set. While these techniques slow the inference process from $\sim$1 sec to $\sim$1.5 min per object, this is still significantly faster than standard approaches while also dramatically expanding the applicability of SBI. This expanded regime has broad implications for future applications to astronomical surveys. 
Bingjie Wang · Joel Leja · Victoria Villar · Joshua Speagle 🔗 - Towards Creating Benchmark Datasets of Universal Neural Network Potential for Material Discovery (Poster) []  Recently, neural network potentials (NNPs) have been shown to be particularly effective in conducting atomistic simulations for computational material discovery. Especially in recent years, large-scale datasets have begun to emerge for the purpose of ensuring versatility. However, we show that even with a large dataset and a model that achieves good validation accuracy, the resulting energy surface can be quite delicate and can easily reach unrealistic extrapolation regions during the simulation. We first demonstrate this behavior using a DimeNet++ trained on the Open Catalyst 2020 dataset (OC20). Based on this observation, we propose the hypothesis that for NNP models to attain versatility, the training dataset should contain a diverse set of virtual structures. To verify this, we have created a much smaller benchmark dataset called the "High-temperature Multi-Element 2021" (HME21) dataset, which was sampled through a high-temperature molecular dynamics simulation and has less prior information. We conduct benchmark experiments on HME21 and show that training a TeaNet on HME21 can achieve better performance in reproducing the absorption process, although HME21 does not contain corresponding atomic structures. Our findings indicate that dataset diversity can be more essential than dataset quantity in training universal NNPs for material discovery. So Takamoto · Chikashi Shinagawa · Nontawat Charoenphakdee 🔗 - Physics-informed neural networks for modeling rate- and temperature-dependent plasticity (Poster) []  This work presents a physics-informed neural network (PINN) based framework to model the strain-rate and temperature dependence of the deformation fields in elastic-viscoplastic solids. 
To avoid unbalanced back-propagated gradients during training, the proposed framework uses a simple strategy with no added computational complexity for selecting scalar weights that balance the interplay between different terms in the physics-based loss function. In addition, we highlight a fundamental challenge involving the selection of appropriate model outputs so that the mechanical problem can be faithfully solved using a PINN-based approach. We demonstrate the effectiveness of this approach by studying two test problems modeling the elastic-viscoplastic deformation in solids at different strain rates and temperatures, respectively. Our results show that the proposed PINN-based approach can accurately predict the spatio-temporal evolution of deformation in elastic-viscoplastic materials. Rajat Arora · Pratik Kakkar · Amit Chakraborty · Biswadip Dey 🔗 - A Trust Crisis In Simulation-Based Inference? Your Posterior Approximations Can Be Unfaithful (Poster) []  We present extensive empirical evidence showing that current Bayesian simulation-based inference algorithms can produce computationally unfaithful posterior approximations. Our results show that all benchmarked algorithms -- (S)NPE, (S)NRE, SNL and variants of ABC -- can yield overconfident posterior approximations, which makes them unreliable for scientific use cases and falsificationist inquiry. Failing to address this issue may reduce the range of applicability of simulation-based inference. For this reason, we argue that research efforts should be made towards theoretical and methodological developments of conservative approximate inference algorithms and present research directions towards this objective. In this regard, we show empirical evidence that ensembling posterior surrogates provides more reliable approximations and mitigates the issue. 
Joeri Hermans · Arnaud Delaunoy · François Rozet · Antoine Wehenkel · Volodimir Begy · Gilles Louppe 🔗 - Learning-based solutions to nonlinear hyperbolic PDEs: Empirical insights on generalization errors (Poster) []  We study learning weak solutions to nonlinear hyperbolic partial differential equations (H-PDE), which have been difficult to learn due to discontinuities in their solutions. We use a physics-informed variant of the Fourier Neural Operator ($\pi$-FNO) to learn the weak solutions. We empirically quantify the generalization/out-of-sample error of the $\pi$-FNO solver as a function of input complexity, i.e., the distributions of initial and boundary conditions. Our testing results show that $\pi$-FNO generalizes well to unseen initial and boundary conditions. We find that the generalization error grows linearly with input complexity. Further, adding the physics-informed regularizer improved the prediction of discontinuities in the solution. We use the Lighthill-Witham-Richards (LWR) traffic flow model as a guiding example to illustrate the results. Bilal Thonnam Thodi · Sai Venkata Ramana Ambadipudi · Saif Eddin Jabari 🔗 - Modeling halo and central galaxy orientations on the SO(3) manifold with score-based generative models (Poster) []  Upcoming cosmological weak lensing surveys are expected to constrain cosmological parameters with unprecedented precision. In preparation for these surveys, large simulations with realistic galaxy populations are required to test and validate analysis pipelines. However, these simulations are computationally very costly, and at the volumes and resolutions demanded by upcoming cosmological surveys, they are computationally infeasible. Here, we propose a Deep Generative Modeling approach to address the specific problem of emulating realistic 3D galaxy orientations in synthetic catalogs. For this purpose, we develop a novel Score-Based Diffusion Model specifically for the SO(3) manifold. 
The model accurately learns and reproduces correlated orientations of galaxies and dark matter halos that are statistically consistent with those of a reference high-resolution hydrodynamical simulation. Yesukhei Jagvaral · Francois Lanusse · Rachel Mandelbaum 🔗 - Improved Training of Physics-informed Neural Networks using Energy-Based priors: A Study on Electrical Impedance Tomography (Poster) []  Physics-informed neural networks (PINNs) are attracting significant attention for solving partial differential equation (PDE) based inverse problems, including electrical impedance tomography (EIT). EIT is non-linear, and its inverse problem in particular is highly ill-posed. Therefore, successful training of a PINN is extremely sensitive to the interplay between different loss terms and hyper-parameters, including the learning rate. In this work, we propose a Bayesian approach that uses a data-driven energy-based model (EBM) as a prior to improve the overall accuracy and quality of tomographic reconstruction. In particular, the EBM is trained over the possible solutions of the PDEs with different boundary conditions. By imparting such a prior onto physics-based training, PINN convergence to the PDE’s solution is accelerated by more than a factor of ten. Our evaluation shows that the proposed method is more robust for solving the EIT problem. 
Akarsh Pokkunuru · Pedram Rooshenas · Thilo Strauss · Anuj Abhishek · Taufiquar Khan 🔗 - Geometric path augmentation for inference of sparsely observed stochastic nonlinear systems (Poster) []  Stochastic evolution equations describing the dynamics of systems under the influence of both deterministic and stochastic forces are prevalent in all fields of science. Yet, identifying these systems from sparse-in-time observations remains a challenging endeavour. Existing approaches either focus on the temporal structure of the observations by relying on conditional expectations, thereby discarding information ingrained in the geometry of the system's invariant density, or employ geometric approximations of the invariant density, which are nevertheless restricted to systems with conservative forces. Here we propose a method that reconciles these two paradigms. We introduce a new data-driven path augmentation scheme that takes the local observation geometry into account. By employing non-parametric inference on the augmented paths, we can efficiently identify the deterministic driving forces of the underlying system for systems observed at low sampling rates. Dimitra Maoutsa 🔗 - How good is the Standard Model? Machine learning multivariate Goodness of Fit tests (Poster) []  We formulate the problem of detecting collective anomalies in collider experiments as a Goodness of Fit test of a reference hypothesis (the Standard Model) to the observed data. Several well established Goodness of Fit methods are available for one-dimensional problems but their multivariate generalisation is still an object of study. We exploit machine learning to build a set of multivariate tests, starting from the outcome of a machine learned binary classifier trained to distinguish the experimental data from the reference expectations, as prescribed in Refs. [1-4]. 
We compare typical one-dimensional test statistics computed on the output of the classifier with less common test statistics built out of standard classification metrics. In the considered setup, the likelihood-ratio test shows a broader model-independent sensitivity to the landscape of the signal benchmarks analysed. A novel test we define, based on event counting with an optimised classifier threshold, is found to perform slightly better than the likelihood-ratio test for resonant signal, but is exposed to strong failures for non-resonant ones. Gaia Grosso · Marco Letizia · Andrea Wulzer · Maurizio Pierini 🔗 - A probabilistic deep learning model to distinguish cusps and cores in dwarf galaxies (Poster) []  Numerical simulations within a cold dark matter (DM) cosmology form halos with a characteristic density profile with a logarithmic inner slope of -1. Various methods, such as Jeans and Schwarzschild modelling, have been used in an attempt to determine the inner density of observed dwarf galaxies, in order to test this theoretical prediction. Here, we develop a mixture density convolutional neural network (MDCNN) to derive a posterior distribution of the inner density slopes of DM halos. We train the MDCNN on a suite of simulated galaxies from the NIHAO and AURIGA projects, inputting line-of-sight velocities and 2D spatial information of the stars within simulated galaxies. The output of the MDCNN is a probability density function representing the posterior probability of a certain slope being the correct one, thus producing accurate and complex information on the uncertainty of the predictions. The model accurately recovers the inner slope of dwarfs: around 82% of the galaxies have a derived inner slope within ±0.1 of their true value, and around 98% within ±0.3. We then apply our model to four Local Group dwarf spheroidal galaxies and find similar results to those obtained with the Jeans modelling based code GravSphere. 
Julen Expósito-Márquez · Marc Huertas-Company · Arianna Di Cintio · Chris Brook · Andrea Macciò · Rob Grant · Elena Arjona 🔗 - Super-resolving Dark Matter Halos using Generative Deep Learning (Poster) []  Generative deep learning methods built upon Convolutional Neural Networks (CNNs) provide great tools for predicting non-linear structure in cosmology. In this work we predict high resolution dark matter halos from large scale, low resolution dark matter only simulations. This is achieved by mapping lower resolution to higher resolution density fields of simulations sharing the same cosmology, initial conditions and box-sizes. To resolve structure down to a factor of 8 increase in mass resolution, we use a variation of U-Net with a conditional Generative Adversarial Network (GAN), generating output that visually and statistically matches the high resolution target extremely well. This suggests that our method can be used to create high resolution density output over Gpc/h box-sizes from low resolution simulations with negligible computational effort. David Schaurecker 🔗 - Using Shadows to Learn Ground State Properties of Quantum Hamiltonians (Poster) []  Predicting properties of the ground state of a given quantum Hamiltonian is an important task central to various fields of science. Recent theoretical results show that for this task learning algorithms enjoy an advantage over non-learning algorithms for a wide range of important Hamiltonians. This work investigates whether the graph structure of these Hamiltonians can be leveraged for the design of sample efficient machine learning models. We demonstrate that corresponding Graph Neural Networks do indeed exhibit superior sample efficiency. Our results provide guidance in the design of machine learning models that learn on experimental data from near-term quantum devices. Viet T. 
Tran · Laura Lewis · Johannes Kofler · Hsin-Yuan Huang · Richard Kueng · Sepp Hochreiter · Sebastian Lehner 🔗 - Set-Conditional Set Generation for Particle Physics (Poster) []  The simulation of particle physics data is a fundamental but computationally intensive ingredient for physics analysis at the Large Hadron Collider, where observational set-valued data is generated conditional on a set of incoming particles. To accelerate this task, we present a novel generative model based on graph neural network and slot-attention components, which exceeds the performance of pre-existing baselines. Sanmay Ganguly · Lukas Heinrich · Nilotpal Kakati · Nathalie Soybelman 🔗 - Score Matching via Differentiable Physics (Poster) []  Diffusion models based on stochastic differential equations (SDEs) gradually perturb a data distribution $p(\mathbf{x})$ over time by adding noise to it. A neural network is trained to approximate the score $\nabla_\mathbf{x} \log p_t(\mathbf{x})$ at time $t$, which can be used to reverse the corruption process. In this paper, we focus on learning the score field that is associated with the time evolution according to a physics operator in the presence of natural non-deterministic physical processes like diffusion. A decisive difference from previous methods is that the SDE underlying our approach transforms the state of a physical system to another state at a later time. For that purpose, we replace the drift of the underlying SDE formulation with a differentiable simulator or a neural network approximation of the physics. At the core of our method, we optimize the so-called probability flow ODE to fit a training set of simulation trajectories inside an ODE solver and solve the reverse-time SDE for inference to sample plausible trajectories that evolve towards a given end state. 
Benjamin Holzschuh · Simona Vegetti · Nils Thuerey 🔗 - Adaptive Selection of Atomic Fingerprints for High-Dimensional Neural Network Potentials (Poster) []  Molecular dynamics simulations of solidification phenomena require accurate representations of solid and liquid phases, making classical force fields often unsuitable. On the other hand, ab initio simulations are infeasible for observing rare nucleation events. Because machine-learned interatomic force fields can recreate ab initio quality forces at a scalability and efficiency near that of classical force fields, simulation of solidification processes is a promising area of application for them. In a neural network potential the choice of input features plays a vital part in its performance. Here we propose embedded feature selection, using the adaptive group lasso technique, for identifying and removing irrelevant atomic fingerprints. Johannes Sandberg · Emilie Devijver · Noel Jakse · Thomas Voigtmann 🔗 - HyperFNO: Improving the Generalization Behavior of Fourier Neural Operators (Poster) []  Physics-informed machine learning aims to build surrogate models for real-world physical systems governed by partial differential equations (PDEs). One of the more popular recently proposed approaches is the Fourier Neural Operator (FNO), which learns the Green's function operator for PDEs based only on observational data. These operators are able to model PDEs for a variety of initial conditions and show the ability of multi-scale prediction. However, as we will show, this model class is not able to generalize to changes in the parameters of the PDEs, such as the viscosity coefficient or forcing term. We propose HyperFNO, an approach combining FNOs with hypernetworks so as to improve the models' extrapolation behavior to a wider range of PDE parameters using a single model. 
HyperFNO learns to generate the parameters of functions operating in both the original and the frequency domain. The proposed architecture is evaluated using various simulation problems. Francesco Alesiani · Makoto Takamoto · Mathias Niepert 🔗 - Normalizing Flows for Hierarchical Bayesian Analysis: A Gravitational Wave Population Study (Poster) []  We propose parameterizing the population distribution of the gravitational wave population modeling framework (Hierarchical Bayesian Analysis) with a normalizing flow. We first demonstrate the merit of this method on illustrative experiments and then analyze four parameters of the latest LIGO data release: primary mass, secondary mass, redshift, and effective spin. Our results show that despite the small and notoriously noisy dataset, the posterior predictive distributions (assuming a prior over the free parameters of the flow) of the observed gravitational wave population recover structure that agrees with robust previous phenomenological modeling results while being less susceptible to biases introduced by less-flexible distribution models. Therefore, the method forms a promising flexible, reliable replacement for population inference distributions, even when data is highly noisy. David Ruhe · Kaze Wong · Miles Cranmer · Patrick Forré 🔗 - Fast kinematics modeling for conjunction with lens image modeling (Poster) []  Galaxy kinematics modeling is currently the computational bottleneck for a joint gravitational lensing+kinematics modeling procedure. We present as a proof of concept the Stellar Kinematics Neural Network (SKiNN), which emulates kinematics calculations for the context of gravitational lens modeling. After a one-time upfront training cost, SKiNN creates velocity dispersion images which are accurate to $\lesssim1\%$ within the region of interest at a speed $\mathcal{O}(10^2-10^3)$ times faster than existing kinematics modeling methods. 
This speedup makes it feasible to jointly model lensing data with spatially resolved kinematic data, which corrects for the largest source of uncertainty in the determination of the Hubble constant. Matthew Gomer · Luca Biggio · Sebastian Ertl · Han Wang · Aymeric Galan · Lyne Van de Vyvere · Dominique Sluse · Georgios Vernardos · Sherry Suyu 🔗 - Multi-Fidelity Transfer Learning for accurate database PDE approximation (Poster) []  Data-driven approaches to accelerate computation time on PDE-based physical problems have recently received growing interest. Deep Learning algorithms are applied to learn from samples of accurate approximations of the PDE solutions computed by numerical solvers. However, generating a large-scale dataset with accurate solutions using these classical solvers remains challenging due to their high computational cost. In this work, we propose a multi-fidelity transfer learning approach that combines a large amount of low-cost data from poor approximations with a small but accurately computed dataset. Experiments on two physical problems (airfoil flow and wheel contact) show that by transferring prior knowledge learned from the inaccurate dataset, our approach can accurately predict PDE solutions, even when only a few samples of highly accurate solutions are available. Wenzhuo LIU · Mouadh Yagoubi · Marc Schoenauer · David Danan 🔗 - Learning Electron Bunch Distribution along a FEL Beamline by Normalising Flows (Poster) []  Understanding and control of laser-driven Free Electron Lasers remain difficult problems that require highly intensive experimental and theoretical research. The gap between simulated and experimentally collected data might complicate studies and interpretation of obtained results. In this work we developed a deep learning based surrogate that could help to fill this gap. We introduce a surrogate model based on normalising flows for conditional phase-space representation of electron clouds in a FEL beamline. 
The results obtained let us discuss the benefits and limitations of using such models to gain a deeper understanding of fundamental processes within a beamline. Anna Willmann · Jurjen Pieter Couperus Cabadağ · Yen-Yu Chang · Richard Pausch · Amin Ghaith · Alexander Debus · Arie Irman · Michael Bussmann · Ulrich Schramm · Nico Hoffmann 🔗 - Continual learning autoencoder training for a particle-in-cell simulation via streaming (Poster) []  The upcoming exascale era will provide a new generation of physics simulations. These simulations will have a high spatiotemporal resolution, which will impact the training of machine learning models since storing a large amount of simulation data on disk is nearly impossible. Therefore, we need to rethink the training of machine learning models for simulations for the upcoming exascale era. This work presents an approach that trains a neural network concurrently with a running simulation without storing data on disk. The training pipeline accesses the training data by in-memory streaming. Furthermore, we apply methods from the domain of continual learning to enhance the generalization of the model. We tested our pipeline on the training of a 3D autoencoder trained concurrently with a laser wakefield acceleration particle-in-cell simulation. Furthermore, we experimented with various continual learning methods and their effect on the generalization. Patrick Stiller · Varun Makdani · Franz Poeschel · Richard Pausch · Alexander Debus · Michael Bussmann · Nico Hoffmann 🔗 - On Using Deep Learning Proxies as Forward Models in Optimization Problems (Poster) []  Physics-based optimization problems are generally very time-consuming, especially due to the computational complexity associated with the forward model. Recent works have demonstrated that physics-modelling can be approximated with neural networks. However, there is always a certain degree of error associated with this learning, and we study this aspect in this paper. 
We demonstrate through experiments on popular mathematical benchmarks, that neural network approximations (NN-proxies) of such functions when plugged into the optimization framework, can lead to erroneous results. In particular, we study the behaviour of particle swarm optimization and genetic algorithm methods and analyze their stability when coupled with NN-proxies. The correctness of the approximate model depends on the extent of sampling conducted in the parameter space, and through numerical experiments, we demonstrate that caution needs to be taken when constructing this landscape with neural networks. Further, the NN-proxies are hard to train for higher dimensional functions, and we present our insights for 4D and 10D problems. The error is higher for such cases, and we demonstrate that it is sensitive to the choice of the sampling scheme used to build the NN-proxy. The code is available at https://github.com/Fa-ti-ma/NN-proxy-in-optimization. Fatima Albreiki · Nidhal Belayouni · Deepak Gupta 🔗 - HGPflow: Particle reconstruction as hyperedge prediction (Poster) []  We approach particle reconstruction in collider experiments as a set-to-set problem and show the efficacy of a deep-learning model that predicts hypergraph incidence structure. This model outperforms a benchmark parameterized algorithm in predicting the momentum of particle jets and shows an ability to disentangle individual neutral particles in the collimated environment. Representing particles as hyperedges on the set of input nodes introduces an inductive bias that predisposes the predictions to conserve energy and thus promotes accurate, interpretable results. 
Etienne Dreyer · Nilotpal Kakati · Francesco Armando Di Bello 🔗 - Anomaly Detection with Multiple Reference Datasets in High Energy Physics (Poster) []  An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and Simulation Assisted Likelihood-free Anomaly Detection (SALAD), rely on a single reference dataset. They cannot take advantage of commonly-available multiple datasets and thus cannot fully exploit available information. In this work, we propose generalizations of CWoLa and SALAD for settings where multiple reference datasets are available, building on weak supervision techniques. We demonstrate improved performance in a number of settings with real and synthetic data. As an added benefit, our generalizations enable us to provide finite-sample guarantees, improving on existing asymptotic analyses. Mayee Chen · Benjamin Nachman · Frederic Sala 🔗 - Do Better QM9 Models Extrapolate as Better Quantum Chemical Property Predictors? (Poster) []  The implicit hypothesis behind benchmarking on the gold standard QM9 dataset is that model improvement on small and concentrated molecules implies improvement in generalization as better quantum chemical property (QCP) predictors. This extrapolation ability for deep learning (DL) models is highly useful for various real-world applications, yet the related investigation remains quite limited. The goal of this paper is to promote the development of DL models that can extrapolate beyond the in-domain dataset and can handle larger molecules than those in the training data. To achieve this goal, a cross-dataset benchmark of training models on QM9 dataset and testing on ALchemy datasets with Larger molecular size (QMALL) is proposed. 
Experimental results using recent DL methods are provided to investigate their out-of-distribution (OOD) behavior. Analysis of the overall performance drop, model ranking inconsistency, aggregation method selection, and error patterns created new insights into this OOD extrapolation issue, highlighting its challenge for the research community to tackle. YUCHENG ZHANG · Nontawat Charoenphakdee · So Takamoto 🔗 - Diversity Balancing Generative Adversarial Networks for fast simulation of the Zero Degree Calorimeter in the ALICE experiment at CERN (Poster) []  Generative Adversarial Networks (GANs) are powerful models able to synthesize data samples closely resembling the distribution of real data, yet the diversity of those generated samples is limited due to the so-called mode collapse phenomenon observed in GANs. Conditional GANs are especially prone to mode collapse, as they tend to ignore the input noise vector and focus on the conditional information. Recent methods proposed to mitigate this limitation increase the diversity of generated samples, yet they reduce the performance of the models when similarity of samples is required. To address this shortcoming, we propose a novel method to control the diversity of GAN-generated samples. By adding a simple, yet effective regularization to the training loss function we encourage the generator to discover new data modes for inputs related to diverse outputs while generating consistent samples for the remaining ones. More precisely, we reward or penalize the model for synthesising diverse images, matching the diversity of real and generated samples for a given conditional input. We show the superiority of our method on simulating data from the Zero Degree Calorimeter of the ALICE experiment in LHC, CERN. 
Jan Dubiński · Kamil Deja · Sandro Wenzel · Przemysław Rokita · Tomasz Trzcinski 🔗 - Identifying Hamiltonian Manifold in Neural Networks (Poster) []  Recent studies that learn physical laws via deep learning attempt to find the shared representation of the given system by introducing physics priors or inductive biases to the neural network. However, most of these approaches tackle the problem in a system-specific manner, in which one neural network trained on one particular physical system cannot be easily adapted to another system governed by a different physical law. In this work, we use a meta-learning algorithm to identify the general manifold in neural networks that represents Hamilton's equation. We meta-trained the model with a dataset composed of five dynamical systems, each governed by different physical laws. We show that with only a few gradient steps, the meta-trained model adapts well to a physical system that was unseen during the meta-training phase. Our results suggest that the meta-trained model can craft the representation of Hamilton's equation in neural networks which is shared across various dynamical systems, each governed by different physical laws. Yeongwoo Song · Hawoong Jeong 🔗 - Physics-Informed Neural Networks as Solvers for the Time-Dependent Schrödinger Equation (Poster) []  We demonstrate the utility of physics-informed neural networks (PINNs) as solvers for the non-relativistic, time-dependent Schrödinger equation. We study the performance and generalisability of PINN solvers on the time evolution of a quantum harmonic oscillator across varying system parameters, domains, and energy states. Karan Shah · Patrick Stiller · Nico Hoffmann · Attila Cangi 🔗 - Time-aware Bayesian optimization for adaptive particle accelerator tuning (Poster) []  Particle accelerators require continuous adjustment to maintain beam quality. 
At the Advanced Photon Source (APS) synchrotron facility this is accomplished using a mix of operator-controlled and automated tools. We have recently implemented Bayesian optimization (BO) as one of the automated options, significantly improving sampling efficiency. However, poor BO performance was observed in certain scenarios due to time-dependent device drifts. In this work, we discuss extending BO to an adaptive version (ABO) that can compensate for distribution drifts through explicit time-awareness, enabling long-term online operational use. Our contributions include advanced kernels with physics-informed time dimension structure, age-biased data history subsampling, and auxiliary time-aware safety constraint models. Benchmarks show better ABO performance in several simulated and experimental tests. Our results are an encouraging step for the wider adoption of ML-based optimizers at APS. Nikita Kuklev · Yine Sun · Hairong Shang · Michael Borland · Gregory Fystro 🔗 - Inferring molecular complexity from mass spectrometry data using machine learning (Poster) []  Molecular complexity has been proposed as a potential agnostic biosignature, that is, a way to search for signs of life beyond Earth without relying on “life as we know it.” More than one way to compute molecular complexity has been proposed, so comparing their performance in evaluating experimental data collected in situ, such as on board a probe or rover exploring another planet, is imperative. Here, we report the results of an attempt to deploy multiple machine learning (ML) techniques to predict molecular complexity scores directly from mass spectrometry data. Our initial results are encouraging and may provide fruitful guidance toward determining which complexity measures are best suited for use with experimental data. 
Beyond the search for signs of life, this approach is likewise valuable for studying the chemical composition of samples to assist decisions made by the rover or probe, and may thus contribute toward supporting the need for greater autonomy. Timothy Gebhard · Aaron Bell · Jian Gong · Jaden J. A. Hastings · George Fricke · Nathalie Cabrol · Scott Sandford · Michael Phillips · Kimberley Warren-Rhodes · Atilim Gunes Baydin 🔗 - A physics-informed search for metric solutions to Ricci flow, their embeddings, and visualisation (Poster) []  Neural networks with PDEs embedded in their loss functions (physics-informed neural networks) are employed as function approximators to find solutions to the Ricci flow (a curvature based evolution) of Riemannian metrics. A general method is developed and applied to the real torus. The validity of the solution is verified by comparing the time evolution of scalar curvature with that found using a standard PDE solver, which decreases to a constant value of 0 on the whole manifold. We also consider certain solitonic solutions to the Ricci flow equation in two real dimensions. We create visualisations of the flow by utilising an embedding into $\mathbb{R}^3$. Snapshots of highly accurate numerical evolution of the toroidal metric over time are reported. We provide guidelines on applications of this methodology to the problem of determining Ricci flat Calabi-Yau metrics in the context of string theory, a long-standing problem in complex geometry. Aarjav Jain · Challenger Mishra · Pietro Lió 🔗 - Detection is truncation: studying source populations with truncated marginal neural ratio estimation (Poster) []  Statistical inference of population parameters of astrophysical sources is challenging. It requires accounting for selection effects, which stem from the artificial separation between bright detected and dim undetected sources that is introduced by the analysis pipeline itself. 
We show that these effects can be modeled self-consistently in the context of sequential simulation-based inference. Our approach couples source detection and catalogue-based inference in a principled framework that derives from the truncated marginal neural ratio estimation (TMNRE) algorithm. It relies on the realization that detection can be interpreted as prior truncation. We outline the algorithm, and show first promising results. Noemi Anau Montel · Christoph Weniger 🔗 - Galaxy Morphological Classification with Deformable Attention Transformer (Poster) []  Galaxy morphological classification is an important but challenging task in astronomy. Most prior work studies coarse-level morphological classification and uses raster low-dynamic range images, but we are interested in high-dynamic range images commonly produced in imaging surveys. To tackle this problem, we first build a high-dynamic range dataset for fine-level multi-class classification that is challenging even for human eyes. Then we propose to use Deformable Attention Transformer for this difficult task with five-band images and masks, and in the experimental results our model achieves about 70% and 94% for top-1 and top-2 test set accuracies, respectively. We also visualize attention maps and analyze the results with respect to different classes and mask sizes to understand the data and behavior of the model. We confirm that our model shows confusion patterns in its confusion matrix similar to those of humans, and that its attention visualization captures morphological characteristics. SEOKUN KANG · Min-Su Shin · Taehwan Kim 🔗 - Towards solving model bias in cosmic shear forward modeling (Poster) []  As the volume and quality of modern galaxy surveys increase, so does the difficulty of measuring the cosmological signal imprinted in galaxy shapes. 
Weak gravitational lensing sourced by the most massive structures in the Universe generates a slight shearing of galaxy morphologies called cosmic shear, a key probe for cosmological models. Modern techniques of shear estimation based on statistics of ellipticity measurements suffer from the fact that the ellipticity is not a well-defined quantity for arbitrary galaxy light profiles, biasing the shear estimation. We show that a hybrid physical and deep learning Hierarchical Bayesian Model, where a generative model captures the galaxy morphology, enables us to recover an unbiased estimate of the shear on realistic galaxies, thus solving the model bias. Benjamin Remy · Francois Lanusse · Jean-Luc Starck 🔗 - Physical Data Models in Machine Learning Imaging Pipelines (Poster) Light propagates from the object through the optics up to the sensor to create an image. Once the raw data is collected, it is processed through a complex image signal processing (ISP) pipeline to produce an image compatible with human perception. However, this processing is rarely considered in machine learning modelling because available benchmark data sets are generally not in raw format. This study shows how to embed the forward acquisition process into the machine learning model. We consider the optical system and the ISP separately. Following the acquisition process, we start from a drone and airship image dataset to emulate realistic satellite raw images with on-demand parameters. The end-to-end process is built to resemble the optics and sensor of the satellite setup. These parameters are satellite mirror size, focal length, pixel size and pattern, exposure time and atmospheric haze. After raw data collection, the ISP plays a crucial role in neural network robustness. We jointly optimize a parameterized differentiable image processing pipeline with a neural network model. This can lead to speed-up and stabilization of classifier training at a margin of up to 20% in validation accuracy. 
Marco Aversa · Luis Oala · Christoph Clausen · Roderick Murray-Smith · Bruno Sanguinetti 🔗 - Amortized Bayesian Inference of GISAXS Data with Normalizing Flows (Poster) []  Grazing-Incidence Small-Angle X-ray Scattering (GISAXS) is a modern imaging technique used in material research to study nanoscale materials. Reconstruction of the parameters of an imaged object imposes an ill-posed inverse problem that is further complicated when only an in-plane GISAXS signal is available. Traditionally used inference algorithms such as Approximate Bayesian Computation (ABC) rely on computationally expensive scattering simulation software, rendering analysis highly time-consuming. We propose a simulation-based framework that combines variational auto-encoders and normalizing flows to estimate the posterior distribution of object parameters given its GISAXS data. We apply the inference pipeline to experimental data and demonstrate that our method reduces the inference cost by orders of magnitude while producing consistent results with ABC. Maksim Zhdanov · Lisa Randolph · Thomas Kluge · Motoaki Nakatsutsumi · Christian Gutt · Marina Ganeva · Nico Hoffmann 🔗 - Insight into cloud processes from unsupervised classification with a rotation-invariant autoencoder (Poster) []  Clouds play a critical role in the Earth’s energy budget and their potential changes are one of the largest uncertainties in future climate projections. However, the use of satellite observations to understand cloud feedbacks in a warming climate has been hampered by the simplicity of existing cloud classification schemes, which are based on single-pixel cloud properties rather than utilizing spatial structures and textures. Recent advances in computer vision enable the grouping of different patterns of images without using human-predefined labels, providing a novel means of automated cloud classification. 
This unsupervised learning approach allows discovery of unknown climate-relevant cloud patterns, and the automated processing of large datasets. We describe here the use of such methods to generate a new AI-driven Cloud Classification Atlas (AICCA), which leverages 22 years and 800 terabytes of MODIS satellite observations over the global ocean. We use a rotation-invariant cloud clustering (RICC) method to classify those observations into 42 AI-generated cloud class labels at ~ 100 km spatial resolution. As a case study, we use AICCA to examine a recent finding of decreasing cloudiness in a critical part of the subtropical stratocumulus deck, and show that the change is accompanied by strong trends in cloud classes. Takuya Kurihana · James Franke · Ian Foster · Ziwei Wang · Elisabeth Moyer 🔗 - Addressing out-of-distribution data for flow-based gravitational wave inference (Poster) []  Simulation-based inference and normalizing flows have recently demonstrated excellent performance when applied to gravitational-wave parameter estimation. These methods can provide accurate results within seconds, in cases where classical methods based on stochastic samplers may take days or even weeks. However, such methods are typically based on deep neural networks and thus unable to reliably deal with out-of-distribution data, such as may arise when predicted signal and noise models do not precisely fit observations. We here present two innovations to deal with this challenge. First, we introduce a probabilistic noise model to augment the training data, making the inference network substantially more robust to distribution shifts in experimental noise. Second, we apply importance sampling to independently verify and correct inference results. This compensates for network inaccuracies and flags failure cases via low sample efficiencies. 
We expect these methods to be key components for the integration of deep learning techniques into production pipelines for gravitational-wave analysis. Maximilian Dax · Stephen Green · Jonas Wildberger · Jonathan Gair · Michael Puerrer · Jakob Macke · Alessandra Buonanno · Bernhard Schölkopf 🔗 - A fast and flexible machine learning approach to data quality monitoring (Poster) []  We present a machine learning based approach for real-time monitoring of particle detectors. The proposed strategy evaluates the compatibility between incoming batches of experimental data and a reference sample representing the data behavior in normal conditions by implementing a likelihood-ratio hypothesis test. The core model is powered by recent large-scale implementations of kernel methods, nonparametric learning algorithms that can approximate any continuous function given enough data. The resulting algorithm is fast, efficient and agnostic about the type of potential anomaly in the data. We show the performance of the model on multivariate data from a drift tube chambers muon detector. Marco Letizia · Gaia Grosso · Andrea Wulzer · Marco Zanetti · Jacopo Pazzini · Marco Rando · Nicolò Lai 🔗 - Cosmology from Galaxy Redshift Surveys with PointNet (Poster) []  In recent years, deep learning approaches have achieved state-of-the-art results in the analysis of point cloud data. In cosmology, galaxy redshift surveys resemble such a permutation invariant collection of positions in space. These surveys have so far mostly been analysed with two-point statistics, such as power spectra and correlation functions. The usage of these summary statistics is best justified on large scales, where the density field is linear and Gaussian. However, in light of the increased precision expected from upcoming surveys, the analysis of (intrinsically non-Gaussian) small angular separations represents an appealing avenue to better constrain cosmological parameters. 
In this work, we aim to improve upon two-point statistics by employing a PointNet-like neural network to regress the values of the cosmological parameters directly from point cloud data. Our implementation of PointNets can analyse inputs of $\mathcal{O}(10^4) - \mathcal{O}(10^5)$ galaxies at a time, which improves upon earlier work for this application by roughly two orders of magnitude. Additionally, we demonstrate the ability to analyse galaxy redshift survey data on the lightcone, as opposed to previously static simulation boxes at a given fixed redshift. Sotiris Anagnostidis · Arne Thomsen · Alexandre Refregier · Tomasz Kacprzak · Luca Biggio · Thomas Hofmann · Tilman Tröster 🔗 - Finding active galactic nuclei through Fink (Poster) []  We present the Active Galactic Nuclei (AGN) classifier as currently implemented within the Fink broker. Features were built upon summary statistics of available photometric points, as well as color estimation enabled by symbolic regression. The learning stage includes an active learning loop, used to build an optimized training sample from labels reported in astronomical catalogs. Using this method to classify real alerts from the Zwicky Transient Facility (ZTF), we achieved 98.0% accuracy, 93.8% precision and 88.5% recall. We also describe the modifications necessary to enable processing data from the upcoming Vera C. Rubin Observatory Large Survey of Space and Time (LSST), and apply them to the training sample of the Extended LSST Astronomical Time-series Classification Challenge (ELAsTiCC). Results show that our designed feature space enables high performances of traditional machine learning algorithms in this binary classification task. Etienne Russeil · Emille Ishida · Julien Peloton · Anais Möller · Roman Le Montagner 🔗
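
The Fink abstract above reports accuracy, precision and recall for a binary AGN / non-AGN classification task. As a reminder of how these three figures relate to each other, here is a minimal sketch that computes them from true and predicted labels. The function name `binary_metrics` and the toy label lists are hypothetical illustrations; they are not taken from the Fink code base, and the toy numbers are not the paper's results.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels.

    Label 1 is treated as the positive class (e.g. "AGN"), 0 as negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true positives that the classifier recovered.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, prec, rec = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

When the positive class is rare, accuracy can be high even if many positives are missed, which is why precision and recall are usually quoted alongside it.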

#### Author Information

##### Kyle Cranmer (University of Wisconsin-Madison)

Kyle Cranmer is an Associate Professor of Physics at New York University and affiliated with NYU's Center for Data Science. He is an experimental particle physicist working primarily on the Large Hadron Collider, based in Geneva, Switzerland. He was awarded the Presidential Early Career Award for Science and Engineering in 2007 and the National Science Foundation's Career Award in 2009. Professor Cranmer developed a framework that enables collaborative statistical modeling, which was used extensively for the discovery of the Higgs boson in July 2012. His current interests are at the intersection of physics and machine learning and include inference in the context of intractable likelihoods, development of machine learning models imbued with physics knowledge, adversarial training for robustness to systematic uncertainty, the use of generative models in the physical sciences, and integration of reproducible workflows in the inference pipeline.