
Machine Learning for Engineering Modeling, Simulation and Design
Alex Beatson · Priya Donti · Amira Abdel-Rahman · Stephan Hoyer · Rose Yu · J. Zico Kolter · Ryan Adams

Sat Dec 12 04:50 AM -- 03:00 PM (PST)
Event URL: https://ml4eng.github.io/

For full details see: https://ml4eng.github.io/

For questions, issues, and on-the-day help, email: ml4eng2020@gmail.com

gather.town link for poster sessions and breaks: https://neurips.gather.town/app/D2n0HkRXoVlgUSWV/ML4Eng-NeurIPS20

Modern engineering workflows are built on computational tools for specifying models and designs, for numerical analysis of system behavior, and for optimization, model-fitting and rational design. How can machine learning be used to empower the engineer and accelerate this workflow? We wish to bring together machine learning researchers and engineering academics to address the problem of developing ML tools which benefit engineering modeling, simulation and design, through reduction of required computational or human effort, through permitting new rich design spaces, through enabling production of superior designs, or through enabling new modes of interaction and new workflows.

Sat 4:50 a.m. - 5:00 a.m.
Opening Remarks (Live)
Sat 5:00 a.m. - 5:30 a.m.

Differentiable physics solvers (from the broader field of differentiable programming) show particular promise for including prior knowledge into machine learning algorithms. Differentiable operators were shown to be powerful tools to guide deep learning processes, and PDEs provide a wide range of components to build such operators. They also represent a natural way for traditional solvers and deep learning methods to coexist: Using PDE solvers as differentiable operators in neural networks allows us to leverage existing numerical methods for efficient solvers, e.g., to provide reliable and flexible gradients to update the weights during a learning run.

Interestingly, it turns out to be beneficial to combine "traditional" supervised and physics-based approaches. The former poses a much more straightforward and more stable learning task by providing explicit reference data, while physics-based learning can provide gradients for a larger space of states that are only encountered at training time. Here, differentiable solvers are particularly powerful, e.g., to provide neural networks with feedback about how inferred solutions influence a physical model's long-term behavior. I will show and discuss examples with various advection-diffusion type PDEs, among others the Navier-Stokes equations for fluids, for different learning applications. These demonstrations will highlight the properties and capabilities of PDE-powered deep neural networks and serve as a starting point for discussing future developments.

Bio: Nils is an Associate Professor at the Technical University of Munich (TUM). He and his group focus on deep learning methods for physical simulations, with a particular focus on fluid phenomena. He acquired a Ph.D. for his work on liquid simulations in 2006 from the University of Erlangen-Nuremberg. Until 2010 he held a position as a post-doctoral researcher at ETH Zurich. He received a tech-Oscar from the AMPAS in 2013 for his research on controllable smoke effects. Subsequently, he worked for three years as R&D lead at ScanlineVFX, before starting at TUM in October 2013.

Nils Thuerey
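
The idea of using a PDE solver as a differentiable operator, described in the abstract above, can be illustrated with a minimal sketch. It is not taken from the talk: the 1D periodic diffusion scheme, the grid size, the names `laplacian`, `simulate_with_sensitivity`, `loss_and_grad` and `fit_diffusivity`, and all numerical values are assumptions chosen for illustration. The derivative of the final state with respect to the diffusivity is propagated through the solver steps by hand (forward-mode sensitivity), whereas a differentiable-programming framework would typically obtain it automatically.

```python
import math

def laplacian(u):
    """Discrete Laplacian on a periodic 1D grid with unit spacing."""
    n = len(u)
    return [u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n] for i in range(n)]

def simulate_with_sensitivity(u0, nu, steps):
    """Run explicit diffusion steps u <- u + nu * L(u) and propagate s = du/dnu.

    Differentiating one step with respect to nu gives s <- s + nu * L(s) + L(u).
    """
    u = list(u0)
    s = [0.0] * len(u)
    for _ in range(steps):
        lap_u, lap_s = laplacian(u), laplacian(s)
        s = [s_i + nu * ls_i + lu_i for s_i, ls_i, lu_i in zip(s, lap_s, lap_u)]
        u = [u_i + nu * lu_i for u_i, lu_i in zip(u, lap_u)]
    return u, s

def loss_and_grad(u0, target, nu, steps):
    """Squared error against a target state, its gradient in nu, and |s|^2."""
    u, s = simulate_with_sensitivity(u0, nu, steps)
    residual = [a - b for a, b in zip(u, target)]
    loss = sum(r * r for r in residual)
    grad = sum(2.0 * r * s_i for r, s_i in zip(residual, s))
    return loss, grad, sum(s_i * s_i for s_i in s)

def fit_diffusivity(u0, target, nu_init, steps, iterations=50):
    """Recover nu with Gauss-Newton updates built from the solver sensitivity."""
    nu = nu_init
    for _ in range(iterations):
        _, grad, s_norm2 = loss_and_grad(u0, target, nu, steps)
        if s_norm2 == 0.0:
            break
        nu -= 0.5 * grad / s_norm2  # grad = 2 * J^T r, so this is -(J^T r) / (J^T J)
    return nu

# Hypothetical setup: a Gaussian bump diffused with nu = 0.2 serves as the target.
u0 = [math.exp(-((i - 16) ** 2) / 8.0) for i in range(32)]
target, _ = simulate_with_sensitivity(u0, 0.2, 10)
nu_fit = fit_diffusivity(u0, target, nu_init=0.05, steps=10)
print(nu_fit)
```

In this toy setting the fitted value should land close to the 0.2 used to generate the target. The same pattern, with a neural network producing the quantity that enters the solver, is what allows solver gradients to update network weights.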
Sat 5:30 a.m. - 5:40 a.m.
Nils Thuerey Q&A (Q&A)
Sat 5:40 a.m. - 6:10 a.m.

Understanding the generation of 3D shapes and scenes is fundamental to comprehensive perception and understanding of real-world environments. Recently, we have seen impressive progress in 3D shape generation and promising results in generating 3D scenes, largely relying on the availability of large-scale synthetic 3D datasets. However, the application to real-world scenes remains challenging due to the domain gap between synthetic and real 3D data. In this talk, I will discuss a self-supervised approach for 3D scene generation from partial RGB-D observations, and propose new techniques for self-supervised training for generating 3D geometry and color of scenes.

Bio: Angela Dai is an Assistant Professor at the Technical University of Munich. Her research focuses on understanding how the 3D world around us can be modeled and semantically understood. Previously, she received her PhD in computer science from Stanford in 2018 and her BSE in computer science from Princeton in 2013. Her research has been recognized through a ZDB Junior Research Group Award, an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention, as well as a Stanford Graduate Fellowship.

Angela Dai
Sat 6:10 a.m. - 6:20 a.m.
Angela Dai Q&A (Q&A)
Sat 6:20 a.m. - 8:20 a.m.

gather.town link: https://neurips.gather.town/app/D2n0HkRXoVlgUSWV/ML4Eng-NeurIPS20

Sat 8:20 a.m. - 8:50 a.m.

Our brains are able to exploit coarse physical models of fluids to quickly adapt and solve everyday manipulation tasks. However, developing such capability in robots, so that they can autonomously manipulate fluids while adapting to different conditions, remains a challenge. In this talk, I will present different strategies that a robot can use to manipulate liquids by using approximate-but-fast simulation as an internal model. I'll describe strategies to pour and calibrate the parameters of the model from observations of real liquids with different viscosities via Bayesian Likelihood-free Inference. Finally, I'll present a methodology to learn the relevant parameters of a pouring task via Inverse Value Estimation and describe potential applications of the learned posterior to reason about containers and safety.

Bio: Tatiana Lopez-Guevara is a final year PhD student in Robotics and Autonomous Systems at the Edinburgh Centre for Robotics, UK. Her interests are in the application of intuitive physics models for robotic reasoning and manipulation of deformable objects.

Tatiana Lopez-Guevara
Sat 8:50 a.m. - 9:00 a.m.
Tatiana Lopez-Guevara Q&A (Q&A)
Sat 9:00 a.m. - 9:30 a.m.

This talk will describe various ways of using structured machine learning models for predicting complex physical dynamics, generating realistic objects, and constructing physical scenes. The key insight is that many systems can be represented as graphs with nodes connected by edges, which can be processed by graph neural networks and transformer-based models. The goal of the talk is to show how structured approaches are making advances in solving increasingly challenging problems in engineering, graphics, and everyday interactions with the world.

Bio: Peter Battaglia is a research scientist at DeepMind. He earned his PhD in Psychology at the University of Minnesota, and was later a postdoc and research scientist in MIT's Department of Brain and Cognitive Sciences. His current work focuses on approaches for reasoning about and interacting with complex systems, by combining richly structured knowledge with flexible learning algorithms.

Peter Battaglia
Sat 9:30 a.m. - 9:40 a.m.
Peter Battaglia Q&A (Q&A)
Sat 9:40 a.m. - 10:30 a.m.

gather.town room will remain open for people who wish to socialize / network during the break: https://neurips.gather.town/app/D2n0HkRXoVlgUSWV/ML4Eng-NeurIPS20

Sat 10:30 a.m. - 11:30 a.m.
Panel discussion with invited speakers (Panel discussion)
Sat 11:30 a.m. - 12:00 p.m.

Model reduction methods have grown from the computational science community, with a focus on reducing high-dimensional models that arise from physics-based modeling, whereas machine learning has grown from the computer science community, with a focus on creating expressive models from black-box data streams. Yet recent years have seen an increased blending of the two perspectives and a recognition of the associated opportunities. This talk presents our work in operator inference, where we learn effective reduced-order operators directly from data. The physical governing equations define the form of the model we should seek to learn. Thus, rather than learn a generic approximation with weak enforcement of the physics, we learn low-dimensional operators whose structure is defined by the physics. This perspective provides new opportunities to learn from data through the lens of physics-based models and contributes to the foundations of Scientific Machine Learning, yielding a new class of flexible data-driven methods that support high-consequence decision-making under uncertainty for physical systems.

Bio: Karen E. Willcox is Director of the Oden Institute for Computational Engineering and Sciences, Associate Vice President for Research, and Professor of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin. She is also External Professor at the Santa Fe Institute. Before joining the Oden Institute in 2018, she spent 17 years as a professor at the Massachusetts Institute of Technology, where she served as the founding Co-Director of the MIT Center for Computational Engineering and the Associate Head of the MIT Department of Aeronautics and Astronautics. Prior to joining the MIT faculty, she worked at Boeing Phantom Works with the Blended-Wing-Body aircraft design group. She is a Fellow of the Society for Industrial and Applied Mathematics (SIAM) and Fellow of the American Institute of Aeronautics and Astronautics (AIAA).

Karen Willcox
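
The abstract above describes fitting low-dimensional operators whose form is dictated by the governing equations. A minimal sketch of that idea follows, under strong simplifying assumptions that are not from the talk: the reduced state is a single scalar, the physics is assumed to give the form dx/dt = a*x + h*x^2, and exact time derivatives are available. In practice one would typically first project high-dimensional snapshots onto a reduced basis and fit matrix-valued operators. The name `fit_linear_quadratic` and the data are hypothetical.

```python
def fit_linear_quadratic(xs, xdots):
    """Least-squares fit of (a, h) in the assumed model form dx/dt = a*x + h*x^2.

    Solves the 2x2 normal equations
        [sum x^2  sum x^3] [a]   [sum x   * xdot]
        [sum x^3  sum x^4] [h] = [sum x^2 * xdot]
    with Cramer's rule.
    """
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    b1 = sum(x * xd for x, xd in zip(xs, xdots))
    b2 = sum(x * x * xd for x, xd in zip(xs, xdots))
    det = s2 * s4 - s3 * s3
    if det == 0.0:
        raise ValueError("data do not determine both operators")
    a = (b1 * s4 - s3 * b2) / det
    h = (s2 * b2 - s3 * b1) / det
    return a, h

# Hypothetical snapshot data generated from a = -1.5, h = 0.5.
xs = [0.1 * k for k in range(1, 11)]
xdots = [-1.5 * x + 0.5 * x * x for x in xs]
a_hat, h_hat = fit_linear_quadratic(xs, xdots)
print(a_hat, h_hat)
```

Because the regression only searches over models of the prescribed linear-quadratic form, the learned operators keep the structure of the assumed physics rather than being a generic function approximation.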
Sat 12:00 p.m. - 12:10 p.m.
Karen E Willcox Q&A (Q&A)
Sat 12:10 p.m. - 12:40 p.m.

Developments in computation spurred the fourth paradigm of materials discovery and design using artificial intelligence. Our research aims to advance design and manufacturing processes to create the next generation of high-performance engineering and biological materials by harnessing techniques integrating artificial intelligence, multiphysics modeling, and multiscale experimental characterization. This work combines computational methods and algorithms to investigate design principles and mechanisms embedded in materials with superior properties, including bioinspired materials. Additionally, we develop and implement deep learning algorithms to detect and resolve problems in current additive manufacturing technologies, allowing for automated quality assessment and the creation of functional and reliable structural materials. These advances will find applications in robotic devices, energy storage technologies, orthopedic implants, among many others. In the future, this algorithmically driven approach will enable materials-by-design of complex architectures, opening up new avenues of research on advanced materials with specific functions and desired properties.

Bio: Grace X. Gu is an Assistant Professor of Mechanical Engineering at the University of California, Berkeley. She received her PhD and MS in Mechanical Engineering from the Massachusetts Institute of Technology and her BS in Mechanical Engineering from the University of Michigan, Ann Arbor. Her current research focuses on creating new materials with superior properties for mechanical, biological, and energy applications using multiphysics modeling, artificial intelligence, and high-throughput computing, as well as developing intelligent additive manufacturing technologies to realize complex material designs previously impossible. Gu is the recipient of several awards, including the 3M Non-Tenured Faculty Award, MIT Tech Review Innovators Under 35, Johnson & Johnson Women in STEM2D Scholars Award, Royal Society of Chemistry Materials Horizons Outstanding Paper Prize, and SME Outstanding Young Manufacturing Engineer Award.

Grace Gu
Sat 12:35 p.m. - 12:50 p.m.
Grace X Gu Q&A (Q&A)
Sat 12:50 p.m. - 1:00 p.m.
Closing remarks (Live)
Sat 1:00 p.m. - 3:00 p.m.

gather.town link: https://neurips.gather.town/app/D2n0HkRXoVlgUSWV/ML4Eng-NeurIPS20

The gather.town room will remain live past the official 6pm EST finish time: attendees who wish may stay to discuss, network and socialize as long as they like.


Heating, ventilation and air-conditioning (HVAC) systems can have a significant impact on the driving range of battery electric vehicles (EVs). Predicting thermal comfort in an automotive vehicle cabin’s highly asymmetric and dynamic thermal environment is critical for developing energy-efficient HVAC systems. In this study we have coupled high-fidelity Computational Fluid Dynamics (CFD) simulations and Artificial Neural Networks (ANN) to predict vehicle occupant thermal comfort for any combination of steady-state boundary conditions. A vehicle cabin CFD model, validated against climatic wind tunnel measurements, was used to systematically generate training and test data that spanned the entire range of boundary conditions which impact occupant thermal comfort in an electric vehicle. ANNs were applied to the simulation data to predict the overall Equivalent Homogeneous Temperature (EHT) comfort index for each occupant. An ensemble of five neural network models was able to achieve a mean absolute error of 2 °C or less in predicting the overall EHT for all occupants in the vehicle on unseen test data, which is acceptable for rapid evaluation and optimization of thermal comfort energy demand. The deep learning model developed in this work enables predictions of thermal comfort for any combination of steady-state boundary conditions in real-time without being limited by time-consuming and expensive CFD simulations or climatic wind tunnel tests. This model has been deployed as an easy-to-use web application within the organization for HVAC engineers to optimize thermal comfort energy demand and, thereby, driving range of electric vehicle programs.

Alok Warey, Shailendra Kaushik, Bahram Khalighi, Michael Cruse, Ganesh Venkatesan

Despite being at the heart of many optimal power flow solvers, Newton-Raphson can suffer from slow and numerically unstable Jacobian matrix inversions at each iteration. To reduce the computational burden associated with calculating the full Jacobian and its inverse, many Quasi-Newton methods attempt to find a solution to the optimality conditions by leveraging an approximate Jacobian matrix. In this paper, a Quasi-Newton method based on machine learning is presented which performs iterative updates for candidate optimal solutions without having to calculate a Jacobian or approximate Jacobian matrix. The resulting learning-based algorithm utilizes a deep neural network with feedback. With proper choice of weights and activation functions, the model becomes a contraction mapping and convergence can be guaranteed. Results demonstrated on networks up to 1,354 buses indicate the proposed method is capable of finding approximate solutions to AC OPF faster than Newton-Raphson, but can suffer from infeasible solutions in large networks.

Kyri Baker
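
The convergence argument in the abstract above rests on the iteration map being a contraction. A minimal sketch of that property, not of the paper's model: the one-layer map `g`, its weights `W` and `bias`, and the helper names are hypothetical and untrained. Because tanh is 1-Lipschitz, a weight matrix whose largest absolute row sum is below 1 makes `g` a contraction in the max-norm, so repeated application converges to a unique fixed point from any start, without forming a Jacobian.

```python
import math

# Hypothetical weights; the largest absolute row sum is 0.5, which is below 1.
W = [[0.3, -0.2],
     [0.1, 0.4]]
bias = [0.5, -0.1]

def g(x):
    """One feedback pass: x -> tanh(W x + bias)."""
    return [math.tanh(sum(w * x_j for w, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, bias)]

def lipschitz_bound(weights):
    """Max-norm Lipschitz bound of g, valid because tanh is 1-Lipschitz."""
    return max(sum(abs(w) for w in row) for row in weights)

def fixed_point(update, x0, tol=1e-12, max_iter=1000):
    """Iterate x <- update(x) until successive iterates differ by less than tol."""
    x = list(x0)
    for k in range(max_iter):
        x_new = update(x)
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new, k + 1
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

x_a, iters_a = fixed_point(g, [0.0, 0.0])
x_b, iters_b = fixed_point(g, [5.0, -5.0])
print(lipschitz_bound(W), x_a, x_b)
```

Both starting points should reach the same fixed point, which is the behavior a contraction guarantees. Whether that fixed point is a feasible or optimal solution is a separate question and depends on how the map is trained.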

In this paper, we introduce an efficient backpropagation scheme for non-constrained implicit functions. These functions are parametrized by a set of learnable weights and may optionally depend on some input, making them perfectly suitable as a learnable layer in a neural network. We demonstrate our scheme on different applications: (i) neural ODEs with the implicit Euler method, and (ii) system identification in model predictive control.

Andreas Look, Simona Doneva, Melih Kandemir, Rainer Gemulla, Jan Peters
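
For context on differentiating through an implicitly defined layer, here is a minimal sketch using the textbook implicit function theorem. It is not the scheme proposed in the paper above. The scalar ODE dz/dt = -theta * z^3, the step size, and the function names are assumptions for illustration. One implicit Euler step defines z1 as the root of F(z, theta) = z - z0 + h * theta * z^3, and the theorem gives dz1/dtheta = -(dF/dz)^(-1) * dF/dtheta without differentiating through the Newton iterations.

```python
def residual(z, theta, z0, h):
    """Implicit Euler residual F(z, theta) for the assumed ODE dz/dt = -theta * z**3."""
    return z - z0 + h * theta * z ** 3

def solve_step(theta, z0, h, tol=1e-13, max_iter=50):
    """Solve F(z, theta) = 0 for z with Newton's method, starting from z0."""
    z = z0
    for _ in range(max_iter):
        f = residual(z, theta, z0, h)
        if abs(f) < tol:
            break
        z -= f / (1.0 + 3.0 * h * theta * z * z)
    return z

def step_with_gradient(theta, z0, h):
    """Return z1 and dz1/dtheta from the implicit function theorem."""
    z = solve_step(theta, z0, h)
    dF_dz = 1.0 + 3.0 * h * theta * z * z
    dF_dtheta = h * z ** 3
    return z, -dF_dtheta / dF_dz

z1, dz1_dtheta = step_with_gradient(theta=2.0, z0=1.0, h=0.1)
print(z1, dz1_dtheta)
```

The gradient only needs the partial derivatives of F at the converged solution, so its cost does not grow with the number of solver iterations.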

Deriving fast and effectively coordinated control actions remains a grand challenge affecting the secure and economic operation of today’s large-scale power grid. This paper presents a novel artificial intelligence (AI) based methodology to achieve multi-objective real-time power grid control for real-world implementation. A state-of-the-art off-policy reinforcement learning (RL) algorithm, soft actor-critic (SAC), is adopted to train AI agents with multi-thread offline training and periodic online training for regulating voltages and transmission losses without violating thermal constraints of lines. A software prototype was developed and deployed in the control center of SGCC Jiangsu Electric Power Company that interacts with their Energy Management System (EMS) every 5 minutes. Massive numerical studies using actual power grid snapshots in the real-time environment verify the effectiveness of the proposed approach. Well-trained SAC agents can learn to provide effective and subsecond (<20 ms) control actions in regulating voltage profiles and reducing transmission losses.

Ruisheng Diao, Di Shi, Bei Zhang, Siqi Wang, Haifeng Li, Chunlei Xu, Tu Lan, Desong Bian, Jiajun Duan, Zheng Wu

We propose a novel scheme for fitting heavily parameterized non-linear stochastic differential equations (SDEs). We assign a prior on the parameters of the SDE drift and diffusion functions to achieve a Bayesian model. We then infer this model using the well-known local reparameterization trick for the first time for empirical Bayes, i.e. to integrate out the SDE parameters. The model is then fit by maximizing the likelihood of the resultant marginal with respect to a potentially large number of hyperparameters, which prohibits stable training. As the prior parameters are marginalized, the model also no longer provides a principled means to incorporate prior knowledge. We overcome both of these drawbacks by deriving a training loss that comprises the marginal likelihood of the predictor and a PAC-Bayesian complexity penalty. We observe on synthetic as well as real-world time series prediction tasks that our method provides an improved model fit accompanied with favorable extrapolation properties when provided a partial description of the environment dynamics. Hence, we view the outcome as a promising attempt at building cutting-edge hybrid learning systems that effectively combine first-principles physics and data-driven approaches.

Manuel Haußmann, Sebastian Gerwinn, Andreas Look, Barbara Rakitsch, Melih Kandemir

Microstructural materials design is one of the most important applications of inverse modeling in materials science. Generally speaking, there are two broad modeling paradigms in scientific applications: forward and inverse. While the forward modeling estimates the observations based on known parameters, the inverse modeling attempts to infer the parameters given the observations. Inverse problems are usually more critical as well as difficult in scientific applications as they seek to explore the parameters that cannot be directly observed. Inverse problems are used extensively in various scientific fields, such as geophysics, healthcare and materials science. However, it is challenging to solve inverse problems, because they usually need to learn a one-to-many non-linear mapping, and also require significant computing time, especially for high-dimensional parameter space. Further, inverse problems become even more difficult to solve when the dimension of input (i.e. observation) is much lower than that of output (i.e. parameters). In this work, we propose a framework consisting of generative adversarial networks and mixture density networks for inverse modeling, and it is evaluated on a materials science dataset for microstructural materials design. Compared with baseline methods, the results demonstrate that the proposed framework can overcome the above-mentioned challenges and produce multiple promising solutions in an efficient manner.

Zijiang Yang, Dipendra Jha, Arindam Paul, Wei-keng Liao, Alok Choudhary, Ankit Agrawal

Modeling and sensitivity analysis of complex photovoltaic device processes is explored in this work. We use conditional variational autoencoders to learn the generative model and latent space of the process, which is in turn used to predict the device performance. We further compute the Jacobian of the trained neural network to obtain global sensitivity indices of the inputs, which give an intuition for and interpretation of the process. The results show that generative models outperform predictive models for learning device processes. Furthermore, comparison of the results with sampling-based sensitivity analysis methods demonstrates the validity of our approach and the interpretability of the learned latent space.

Maryam Molamohammadi, Sahand Rezaei-Shoshtari, Nathaniel Quitoriano

A common pattern of progress in engineering has seen deep neural networks displacing human-designed logic. While there are many advantages to this approach, divorcing decision-making from human oversight and intuition has costs as well. One is that deep neural networks can map similar inputs to very different outputs in a way that makes their application to safety-critical problems problematic.

We present a method to check that the decisions of a deep neural network are as intended by constructing the exact preimage of its predictions. Preimages generalize verification in the sense that they can be used to verify a wide class of properties, and answer much richer questions besides. We examine the functioning of an aircraft collision avoidance system, and show how exact preimages reduce undue conservatism when examining dynamic safety.

Our method iterates backwards through the layers of piecewise linear deep neural networks. Uniquely, we compute all intermediate values that correspond to a prediction, propagating this calculation through layers using analytical formulae for layer preimages.

Kyle Matoba, François Fleuret

With ever-increasing population and quality of healthcare, demand for energy and natural resources will inevitably rise. Therefore, it is important to design highly efficient and sustainable chemical processes. The performance of a chemical plant is highly affected by its design and control. A design cannot be evaluated without its controls and vice versa. To optimally address design and control simultaneously, one must formulate a bi-level mixed-integer nonlinear program with a dynamic optimization problem as the inner problem; this is intractable. However, by computing an optimal policy using reinforcement learning, a controller with a closed-form expression can be found and embedded into the mathematical program. In this work, an approach using a policy gradient method along with mathematical programming to solve the problem simultaneously is proposed. The approach was tested in two case studies and the performance of the controller was evaluated. It was shown that the proposed approach outperforms current state-of-the-art control strategies. This opens a whole new range of possibilities to address the simultaneous design and control of engineering systems.

Steven Sachio, Antonio del Rio Chanona, Panagiotis Petsagkourakis

The rising availability of large-volume data has enabled a wide application of statistical Machine Learning (ML) algorithms in the domains of Cyber-Physical Systems (CPS), Internet of Things (IoT) and Smart Building Networks (SBN). This paper proposes a learning-based framework for sequentially applying data-driven statistical methods to predict indoor temperature, and yields an algorithm for controlling a building heating system accordingly. This framework consists of a two-stage modelling effort: in the first stage, a univariate time series model (AR) was employed to predict ambient conditions; together with other control variables, they served as the input features for a second stage of modelling where a multivariate ML model (XGBoost) was deployed. The models were trained with real-world data from building sensor network measurements, and used to predict future temperature trajectories. Experimental results demonstrate the effectiveness of the modelling approach and control algorithm, and reveal the promising potential of the data-driven approach in smart building applications over traditional dynamics-based modelling methods. By making wise use of IoT sensory data and ML algorithms, this work contributes to efficient energy management and sustainability in smart buildings.

Yongchao Huang, Hugh Miles, Pengfei Zhang

We study the problem of optimizing expensive blackbox functions over combinatorial spaces (e.g., sets, sequences, trees, and graphs). BOCS is a state-of-the-art Bayesian optimization method for tractable statistical models, which performs semi-definite programming based acquisition function optimization (AFO) to select the next structure for evaluation. Unfortunately, BOCS scales poorly for a large number of binary and/or categorical variables. Based on recent advances in submodular relaxation for solving Binary Quadratic Programs, we study an approach referred to as Parametrized Submodular Relaxation (PSR) towards the goal of improving the scalability and accuracy of solving AFO problems for the BOCS model. Experiments on diverse benchmark problems including real-world applications in communications engineering and electronic design automation show significant improvements with PSR for the BOCS model.

Aryan Deshwal, Syrine Belakaria, Jana Doppa

Calibration of large-scale differential equation models to observational or experimental data is a widespread challenge throughout applied sciences and engineering. A crucial bottleneck in state-of-the-art calibration methods is the calculation of local sensitivities, i.e. derivatives of the loss function with respect to the estimated parameters, which often necessitates several numerical solves of the underlying system of partial differential equations. In this paper, we present a new probabilistic approach which permits budget-constrained computations of local sensitivities, providing a quantification of uncertainty incurred in the sensitivities from this constraint. Moreover, information from previous sensitivity estimates can be recycled in subsequent computations, reducing the overall computational effort for iterative gradient-based calibration methods.

Jon Cockayne, Andrew Duncan

Significant progress has been made in obtaining approximate solutions to PDEs using neural networks as a basis. One of these approaches (and the most popular and well-developed one) is the Physics Informed Neural Network (PINN). PINN has proved to provide promising results in various forward and inverse problems with great accuracy. However, PINN cannot be employed in its native form for solving problems where the PDE changes its form or when there is a discontinuity in the parameters of the PDE across different sub-domains. Using separate PINNs for each sub-domain and connecting the corresponding solutions by interface conditions is a possible solution for this. However, this approach demands a high computational burden and memory usage. Here, we present a new method, Transfer Physics Informed Neural Network (TPINN), where one or more layers of the PINN across different non-overlapping sub-domains are changed, keeping the other layers the same for all the sub-domains. Solutions from different sub-domains are connected via problem-specific interface conditions which are incorporated into the loss function. We demonstrate the efficacy of TPINN through two heat transfer problems.

Sreehari Manikkan, Balaji Srinivasan

Sequential assembly with geometric primitives has drawn attention in robotics and 3D vision since it yields a practical blueprint to construct a target shape. However, due to its combinatorial property, a greedy method falls short of generating a sequence of volumetric primitives. To alleviate this consequence induced by a huge number of feasible combinations, we propose a combinatorial 3D shape generation framework. The proposed framework reflects an important aspect of human generation processes in real life -- we often create a 3D shape by sequentially assembling unit primitives with geometric constraints. To find the desired combination regarding combination evaluations, we adopt Bayesian optimization, which is able to exploit and explore efficiently the feasible regions constrained by the current primitive placements. An evaluation function conveys global structure guidance for an assembly process and stability in terms of gravity and external forces simultaneously. Experimental results demonstrate that our method successfully generates combinatorial 3D shapes and simulates more realistic generation processes. We also introduce a new dataset for combinatorial 3D shape generation.

Jungtaek Kim, Hyunsoo Chung, Jinhwi Lee, Minsu Cho, Jaesik Park

Deep neural networks (DNNs) have made a revolution in numerous fields during the last decade. However, in tasks with high safety requirements, such as medical or autonomous driving applications, providing an assessment of the model's reliability can be vital. Uncertainty estimation for DNNs has been addressed using Bayesian methods, providing mathematically founded models for reliability assessment. These models are computationally expensive and generally impractical for many real-time use cases. Recently, non-Bayesian methods were proposed to tackle uncertainty estimation more efficiently. We propose an efficient method for uncertainty estimation in DNNs achieving high accuracy. We simulate the notion of multi-task learning on single-task problems by producing parallel predictions from similar models differing by their loss. This multi-loss approach allows one-phase training for single-task learning with uncertainty estimation. We keep our inference time relatively low by leveraging the advantage proposed by the Deep Sub-Ensembles method. The novelty of this work resides in the proposed accurate variational inference with a simple and convenient training procedure, while remaining competitive in terms of computational time. We conduct experiments on SVHN, CIFAR10, CIFAR100 as well as ImageNet using different architectures. Our results show improved accuracy on the classification task and competitive results on several uncertainty measures.

Omer Achrack, Raizy Kellerman, Ouriel Barzilay

Assimilation of continuously streamed monitored data is an essential component of a digital twin. The assimilated data are then used to ensure the digital twin is a true representation of the monitored system; one way this is achieved is by calibration of simulation models, whether data-derived or physics-based. Traditional manual calibration is not time-efficient in this context; new methods are required for continuous calibration. In this paper, a particle filter methodology for continuous calibration of the physics-based model element of a digital twin is presented and applied to an example of an underground farm. The results are compared against static Bayesian calibration and are shown to give insight into the time variation of dynamically varying model parameters.

Rebecca Ward, Ruchi Choudhary, Alastair Gregory
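
The continuous calibration idea in the abstract above can be sketched with a bootstrap particle filter. This is a toy illustration, not the paper's method or model: the scalar relation y = theta * u standing in for the physics-based model, the noise levels, the prior range, and the name `particle_filter` are all assumptions. Each new observation reweights a cloud of candidate parameter values, and a small random-walk jitter lets the estimate follow parameters that change over time.

```python
import math
import random

def particle_filter(observations, inputs, n_particles=500,
                    obs_sd=0.2, jitter_sd=0.02, seed=0):
    """Track theta in the assumed model y = theta * u + noise, one observation at a time."""
    rng = random.Random(seed)
    # Hypothetical prior: theta uniformly distributed on [0, 5].
    particles = [rng.uniform(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y, u in zip(observations, inputs):
        # Random-walk jitter allows the parameter to drift between observations.
        particles = [p + rng.gauss(0.0, jitter_sd) for p in particles]
        # Weight each particle by the Gaussian likelihood of the new observation.
        weights = [math.exp(-0.5 * ((y - p * u) / obs_sd) ** 2) for p in particles]
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        estimates.append(sum(particles) / n_particles)
    return estimates

# Synthetic stream generated with a true parameter of 2.0 (illustrative only).
data_rng = random.Random(1)
true_theta = 2.0
inputs = [1.0 + 0.5 * math.sin(0.3 * t) for t in range(60)]
observations = [true_theta * u + data_rng.gauss(0.0, 0.2) for u in inputs]
estimates = particle_filter(observations, inputs)
print(estimates[-1])
```

The sequence of estimates, rather than a single calibrated value, is what gives a view of how a parameter varies over time, in contrast to a static calibration over the whole record.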

Mesh-based simulations are central to modeling complex physical systems in many disciplines across science and engineering, as they support powerful numerical integration methods and their resolution can be adapted to strike favorable trade-offs between accuracy and efficiency. Here we introduce MeshGraphNets, a graph neural network-based method for learning simulations, which leverages mesh representations. Our model can be trained to pass messages on a mesh graph and to adapt the mesh discretization during forward simulation. We show that our method can accurately predict the dynamics of a wide range of physical systems, including aerodynamics, structural mechanics, and cloth, and do so efficiently, running 1-2 orders of magnitude faster than the simulation on which it is trained. Our approach broadens the range of problems on which neural network simulators can operate and promises to improve the efficiency of complex, scientific modeling tasks.

Tobias Pfaff, Meire Fortunato, Alvaro Sanchez Gonzalez, Peter Battaglia

We discuss the implementation of a deep reinforcement learning based agent to automatically make scheduling decisions for a continuous chemical reactor currently in operation. This model is tasked with scheduling the reactor on a daily basis in the face of uncertain demand and production interruptions. The reinforcement learning model has been trained on a simulator of the scheduling process that was built with historical demand and production data. The model has been successfully implemented to develop schedules on-line for an industrial reactor and has exhibited improvements over human made schedules. We discuss the process of training, implementation, and development of this system and the application of reinforcement learning for complex, stochastic decision making in the chemical industry.

Christian Hubbs, Adam Kelloway, John Wassick, Nikolaos Sahinidis, Ignacio Grossmann

Modern design, control, and optimization often require simulation of highly nonlinear models, leading to prohibitive computational costs. These costs can be amortized by evaluating a cheap surrogate of the full model. Here we present a general data-driven method, the continuous-time echo state network (CTESN), for generating surrogates of nonlinear ordinary differential equations with dynamics at widely separated timescales. We empirically demonstrate near-constant time performance using our CTESNs on a physically motivated scalable model of a heating system whose full execution time increases exponentially, while maintaining a relative error within 0.2%. We also show that our model captures fast transients as well as slow dynamics effectively, while other techniques such as physics informed neural networks have difficulties trying to train and predict the highly nonlinear behavior of these models.

Ranjan Anantharaman, Chris Rackauckas, Viral Shah

High-frequency resistance (HFR) is a critical quantity strongly related to a fuel cell system's performance. As such, an accurate and timely prediction of HFR is useful for understanding the system's operating status and the corresponding control strategy optimization. It is beneficial to estimate the fuel cell system's HFR from the measurable operating conditions without resorting to costly HFR measurement devices, the latter of which are difficult to implement at the real automotive scale. In this study, we propose a data-driven approach for real-time prediction of HFR. Specifically, we use a long short-term memory (LSTM) based machine learning model that takes into account both the current and past states of the fuel cell, as characterized through a set of sensors. These sensor signals form the input to the LSTM. The data is experimentally collected from a vehicle lab that operates a 100 kW automotive fuel cell stack running on an automotive-scale test station. Our current results indicate that our prediction model achieves high-accuracy HFR predictions and outperforms other frequently used regression models. We also study the effect of the extracted features generated by our LSTM model. Our study finds that even a simple LSTM based model can accurately predict HFR values.

Tong Lin

In this work, we present a learning based approach to analog circuit design, where the goal is to optimize circuit performance subject to certain design constraints. One of the aspects that makes this problem challenging to optimize is that measuring the performance of candidate configurations with simulation can be computationally expensive, particularly in the post-layout design. Additionally, the large number of design constraints and the interaction between the relevant quantities make the problem complex. Therefore, to better support the human designers, it is desirable to gain knowledge about the whole space of feasible solutions. In order to tackle these challenges, we take inspiration from model-based reinforcement learning and propose a method with two key properties. First, it learns a reward model, i.e., a surrogate model of the performance approximated by neural networks, to reduce the required number of simulations. Second, it uses a stochastic policy generator to explore the diverse solution space satisfying constraints. We combine these in a Dyna-style optimization framework, which we call DynaOpt, and empirically evaluate the performance on a circuit benchmark of a two-stage operational amplifier. The results show that, compared to the model-free method applied with 20,000 circuit simulations to train the policy, DynaOpt achieves much better performance by learning from scratch with only 500 simulations.

Wook Lee, Frans Oliehoek

The emergence of 3D printing technologies for stainless steel enables steel structures with almost arbitrarily complex geometries to be manufactured. A common design preference for steel structures is that they are thin-walled, to reduce weight and limit the requirement for raw material. The mechanical properties of thin-walled structures are principally determined by their geometry; however, 3D-printed steel components exhibit geometric variation beyond that which was intended, due to the welding process involved, at a scale that is non-negligible with respect to the thickness of the wall. The cumulative impact of geometric variation is to alter the macro-scale mechanical properties of a printed component, such as deformation under load. An important challenge is therefore to predict the (random) macro-scale mechanical properties of a component, before it is manufactured. To address this, we trained a generative probabilistic model for rough surfaces defined on smooth manifolds to an experimentally-obtained dataset consisting of samples of 3D-printed steel. Combined with finite element simulation of components under load, we were able to produce detailed probabilistic predictions of the mechanical properties of a 3D-printed steel component. The main technical challenge was to transfer information from the training dataset to the hypothetical component, whose notional geometry may be described by a different manifold. Our proposed solution was to employ spatial random field models which can be characterised locally using a differential operator, and to leverage the correspondence between the Laplacian on the training and the test manifolds to facilitate the transfer of information.

Liam Fleming

Uncertainty Quantification using Markov Chain Monte Carlo (MCMC) can be prohibitively expensive for target probability densities with expensive likelihood functions, for instance when it involves solving a Partial Differential Equation (PDE), as is the case in a wide range of engineering applications. Multilevel Delayed Acceptance (MLDA) with an Adaptive Error Model (AEM) is a novel approach, which alleviates this problem by exploiting a hierarchy of models, with increasing complexity and cost, and correcting the inexpensive models on-the-fly. The method has been integrated with the open-source probabilistic programming package PyMC3 and is available in the latest development version. In this paper, we present the algorithm along with an illustrative example.

Mikkel Lykkegaard, Greg Mingas, Robert Scheichl, Colin Fox, Tim Dodwell
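
The model-hierarchy idea in the abstract above can be illustrated with its simplest ancestor, two-level delayed-acceptance Metropolis-Hastings. This is a sketch under stated assumptions: it is not MLDA itself, has no adaptive error model, and does not use the PyMC3 API. The two log-densities are hypothetical Gaussian stand-ins for an expensive PDE-based posterior and a cheap approximation of it.

```python
import math
import random


def log_fine(x):
    # Hypothetical stand-in for an expensive posterior (e.g. a PDE solve): N(1, 1).
    return -0.5 * (x - 1.0) ** 2


def log_coarse(x):
    # Hypothetical cheap, slightly wrong approximation: N(0.8, 1.2^2).
    return -0.5 * ((x - 0.8) / 1.2) ** 2


def delayed_acceptance(n_steps=20000, step=1.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    lf_x, lc_x = log_fine(x), log_coarse(x)
    samples, fine_evals = [], 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        lc_y = log_coarse(y)
        # Stage 1: screen the proposal using the cheap model only.
        # (1 - random() lies in (0, 1], so the logarithm is always defined.)
        if math.log(1.0 - rng.random()) < lc_y - lc_x:
            lf_y = log_fine(y)
            fine_evals += 1
            # Stage 2: correct with the fine model so the chain still
            # targets the fine posterior exactly.
            if math.log(1.0 - rng.random()) < (lf_y - lf_x) - (lc_y - lc_x):
                x, lf_x, lc_x = y, lf_y, lc_y
        samples.append(x)
    return samples, fine_evals


samples, fine_evals = delayed_acceptance()
posterior_mean = sum(samples) / len(samples)
```

Proposals rejected at stage 1 never touch `log_fine`, so `fine_evals` stays below the number of steps; that saving is what a deeper hierarchy of models aims to amplify.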

Designing a multi-layer optical system with designated optical characteristics is an inverse design problem in which the resulting design is determined by several discrete and continuous parameters. In particular, we consider three design parameters to describe a multi-layer stack: each layer's dielectric material and thickness as well as the total number of layers. Such a combination of discrete and continuous parameters poses a challenging optimization problem that often requires a computationally expensive search for an optimal system design. Hence, most methods merely determine the optimal thicknesses of the system's layers. To incorporate layer material and the total number of layers as well, we propose a method that considers the stacking of consecutive layers as parameterized actions in a Markov decision process. We propose an exponentially transformed reward signal that eases policy optimization and adapt a recent variant of Q-learning for inverse design optimization. We demonstrate that our method outperforms human experts and a naive reinforcement learning algorithm concerning the achieved optical characteristics. Moreover, the learned Q-values contain information about the optical properties of multi-layer optical systems, thereby allowing physical interpretation or what-if analysis.

Heribert Wankerl, Maike Stern, Ali Mahdavi, Christoph Eichler, Elmar Lang

Harnessing the magnetic field of the earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field data from the magnetometer on the navigation system encompass the magnetic field from not just the earth, but also from the vehicle on which it is mounted. It is difficult to separate the earth magnetic anomaly field magnitude, which is crucial for navigation, from the total magnetic field magnitude reading from the sensor. The purpose of this challenge problem is to decouple the earth and aircraft magnetic signals in order to derive a clean signal from which to perform magnetic navigation. Baseline testing on the dataset shows that the earth magnetic field can be extracted from the total magnetic field using machine learning (ML). The challenge is to remove the aircraft magnetic field from the total magnetic field using a trained neural network. These challenges offer an opportunity to construct an effective neural network for removing the aircraft magnetic field from the dataset, using an ML algorithm integrated with physics of magnetic navigation.

Albert Gnadt, Joseph Belarge, Aaron Canciani, Lauren Conger, Joseph Curro, Alan Edelman, Peter Morales, Mike O'Keeffe, Jonathan Taylor, Christopher Rackauckas

In this brief paper we introduce Bayesian polynomial chaos, a Gaussian process analogue to polynomial chaos. We argue why this Bayesian re-formulation of polynomial chaos is necessary and then proceed to mathematically define it, followed by an examination of its utility in computing moments and sensitivities, multi-fidelity modelling, and information fusion.

Pranay Seshadri, Andrew Duncan, Ashley Scillitoe

Lithium-Ion (Li-I) batteries have recently become pervasive and are used in many physical assets. To enable a good prediction of the end of discharge of batteries, detailed electrochemical Li-I battery models have been developed. Their parameters are typically calibrated before they are taken into operation and are typically not re-calibrated during operation. However, since battery performance is affected by aging, the reality gap between the computational battery models and the real physical systems leads to inaccurate predictions. A supervised machine learning algorithm would require an extensive representative training dataset mapping the observation to the ground truth calibration parameters. This may be infeasible for many practical applications. In this paper, we implement a Reinforcement Learning-based framework for reliably and efficiently inferring calibration parameters of battery models. The framework enables real-time inference of the computational model parameters in order to compensate for the reality gap based on the observations. Most importantly, the proposed methodology does not need any labeled data samples (samples of observations and the ground truth calibration parameters). Furthermore, the framework does not require any information on the underlying physical model. The experimental results demonstrate that the proposed methodology is capable of inferring the model parameters with high accuracy and high robustness. While the achieved results are comparable to those obtained with supervised machine learning, they do not rely on the ground truth information during training.

Ajaykumar Unagar, Yuan Tian, Olga Fink, Manuel Arias Chao

Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset is going to operate until a system failure occurs. Deploying successful RUL methods in real-life applications would result in a drastic change of perspective in the context of maintenance of industrial assets. In particular, the design of intelligent maintenance strategies capable of automatically establishing when interventions have to be performed has the potential to drastically reduce costs and machine downtime. In light of their superior performance in a wide range of engineering fields, Machine Learning (ML) algorithms are natural candidates to tackle the challenges involved in the design of intelligent maintenance approaches. In particular, given the potentially catastrophic consequences associated with wrong maintenance decisions, it is desirable that ML algorithms provide uncertainty estimates alongside their predictions. In this work, we propose and compare a number of techniques based on Gaussian Processes (GPs) that can cope with this aspect. We apply these algorithms to the new C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset from NASA for aircraft engines. The results show that the proposed methods are able to provide very accurate RUL predictions along with sensible uncertainty estimates, resulting in more safely deployable solutions to real-life industrial applications.

Luca Biggio, Manuel Arias Chao, Olga Fink

The widespread adoption of robots will require a flexible and automated approach to robot design. Exploring the full space of all possible designs when creating a custom robot can prove to be computationally intractable, leading us to consider modular robots, composed of a common set of repeated components that can be reconfigured for each new task. However, conducting a combinatorial optimization process to create a specialized design for each new task and setting is computationally expensive, especially if the task changes frequently. In this work, our goal is to select mobile robot designs that will perform best in a given environment under a known control policy, with the assumption that the selection process must be conducted for new environments frequently. We use deep reinforcement learning to create a neural network that, given a terrain map as an input, outputs the mobile robot designs deemed most likely to locomote successfully in that environment.

Julian Whitman, Matthew Travers, Howie Choset

Recently, deep reinforcement learning (DRL)-based approaches have shown promise in solving complex decision and control problems in the power engineering domain. In this paper, we present an in-depth analysis of DRL-based voltage control from the aspects of algorithm selection, state space representation, and reward engineering. To resolve observed issues, we propose a novel imitation learning-based approach to directly map power grid operating points to effective actions without any interim reinforcement learning process. The performance results demonstrate that the proposed approach has strong generalization ability with much less training time. The agent trained by imitation learning is effective and robust in solving the voltage control problem and outperforms the former RL agents.

Xiren Zhou, Siqi Wang, Ruisheng Diao, Desong Bian, Jiajun Duan, Di Shi

Physical design and production of integrated circuits (IC) is becoming increasingly challenging as the sophistication of IC technology steadily increases. Placement has been one of the most critical steps in IC physical design. Through decades of research, partition-based, analytical-based, and annealing-based placers have been enriching the placement solution toolbox. However, open challenges including long run time and lack of the ability to generalize continue to restrict wider applications of existing placement tools. We devise a learning-based placement tool based on cyclic application of reinforcement learning (RL) and simulated annealing (SA) by leveraging the advancement of RL. Results show that the RL module is able to provide a better initialization for SA and thus leads to a better final placement design. Compared to other recent learning-based placers, our method differs mainly in its combination of RL and SA, leveraging the RL model's ability to quickly get a good rough solution after training and the heuristics' ability to realize greedy improvements in the solution.

Dhruv Vashisht, Harshik Rampal, Haiguang Liao, Yang Lu, Devika Shanbhag, Elias Fallon, Levent Burak Kara

Magnetically programmed soft structures with complex, fast, and reversible deformation capabilities are transforming various fields including soft robotics, wearable devices, and active metamaterials. While the encoded magnetization profile determines the shape-transformation of the magnetic soft structures, the current design methods are mainly limited to an intuition-based trial-and-error process. In this work, a data-driven inverse design optimization approach for magnetically programmed soft structures is introduced to achieve complex shape-transformations. The proposed method optimizes the design of the magnetization profile by utilizing a genetic algorithm relying on fitness and novelty functions, running cost-effectively in a simulation environment. Inverse design optimization of magnetization profiles for the quasi-static shape-transformation of 2D linear beams into 'M', 'P', and 'I' letter shapes is presented. 3D magnetization profile optimization enabled 3D deformation, as shown in a rotating beam demonstration. The presented approach is also extended to the design of 3D magnetization profiles for the 3D shape-transformation of a linear beam rotating along its longitudinal axis. The data-driven inverse design approach established here paves the way for the automated design of magnetic soft structures with complex 3D shape-transformations.

Alp Karacakol, Yunus Alapan, Metin Sitti

Numerical simulations have revolutionized material design. However, although simulations excel at mapping an input material to its output property, their direct application to inverse design (i.e., mapping an input property to an optimal output material) has traditionally been limited by their high computing cost and lack of differentiability, so that simulations are often replaced by surrogate machine learning models in inverse design problems. Here, taking the example of the inverse design of a porous matrix featuring a targeted sorption isotherm, we introduce a computational inverse design framework that addresses these challenges. We reformulate a lattice density functional theory of sorption as a differentiable simulation programmed on the TensorFlow platform that leverages automated end-to-end differentiation. Thanks to its differentiability, the simulation is used to directly train a deep generative model, which outputs an optimal porous matrix based on an arbitrary input sorption isotherm curve. Importantly, this inverse design pipeline leverages for the first time the power of tensor processing units (TPUs), an emerging family of dedicated chips that, although specialized for deep learning, are flexible enough for intensive scientific simulations. This approach holds promise to accelerate inverse materials design.

Han Liu, Yuhan Liu, Zhangji Zhao, Sam Schoenholz, Dogus Cubuk, Mathieu Bauchy

While additive manufacturing has seen rapid proliferation in recent years, process monitoring and quality assurance methods capable of detecting micro-scale flaws have seen little improvement and remain largely expensive and time-consuming. In this work we propose a pipeline for training two deep learning flaw formation detection techniques including convolutional neural networks and long short-term memory networks. We demonstrate that the flaw formation mechanisms of interest to this study, including keyhole porosity, lack of fusion, and bead up, are separable using these methods. Both approaches have yielded a classification accuracy over 99% on unseen test sets. The results suggest that the implementation of machine learning enabled acoustic process monitoring is potentially a viable replacement for traditional quality assurance methods as well as a tool to guide traditional quality assurance methods.

Wentai Zhang, Levent Burak Kara

The design of complex engineering systems leads to solving very large optimization problems involving different disciplines. Strategies allowing disciplines to optimize in parallel by providing sub-objectives and splitting the problem into smaller parts, such as Collaborative Optimization, are promising solutions. However, most of them have slow convergence, which reduces their practical use. Earlier efforts to accelerate convergence by learning surrogate models have not yet succeeded at sufficiently improving the competitiveness of these strategies. This paper shows that, in the case of Collaborative Optimization, faster and more reliable convergence can be obtained by solving an interesting instance of binary classification: on top of the target label, the training data of one of the two classes contains the distance to the decision boundary and its derivative. Leveraging this information, we propose to train a neural network with an asymmetric loss function, a structure that guarantees Lipschitz continuity, and a regularization towards respecting basic distance function properties. The approach is demonstrated on a toy learning example, and then applied to a multidisciplinary aircraft design problem.

Jean de Becdelievre, Ilan Kroo

The field of DNA nanotechnology has made it possible to assemble, with high yields, different structures that have actionable properties. For example, researchers have created components that can be actuated, used to sense (e.g., changes in pH), or to store and release loads. An exciting next step is to combine these components into multifunctional nanorobots that could, potentially, perform complex tasks like swimming to a target location in the human body, detecting an adverse reaction and then releasing a drug load to stop it. However, as we start to assemble more complex nanorobots, the yield of the desired nanorobot begins to decrease as the number of possible component combinations increases. Therefore, the ultimate goal of this work is to develop a predictive model to maximize yield. However, training predictive models typically requires a large dataset. For the nanorobots we are interested in assembling, this will be difficult to collect. This is because high-fidelity data, which allows us to exactly characterize the shape and size of individual structures, is extremely time-consuming to collect, whereas low-fidelity data is readily available but only captures overall statistics for different processes. Therefore, this work combines low- and high-fidelity data to train a generative model using a two-step process. First, we pretrain the model using a relatively small (1000s), high-fidelity dataset to represent the distribution of nanorobot shapes. Second, we bias the learned distribution towards samples with certain physical properties that are measured using low-fidelity data. In this work we bias our distribution towards a desired node degree of a graphical model that we take as a surrogate representation of the nanorobots that this work will ultimately focus on. We have not yet accumulated a high-fidelity dataset of nanorobots, so we leverage the MolGAN architecture [1] and the QM9 small molecule dataset [2-3] to demonstrate our approach.

Emma Benjaminson, Rebecca Taylor, Matthew Travers

Traditional linear subspace reduced order models (LS-ROMs) are able to accelerate physical simulations, in which the intrinsic solution space falls into a subspace with a small dimension, i.e., the solution space has a small Kolmogorov n-width. However, for physical phenomena not of this type, such as advection-dominated flow phenomena, a low-dimensional linear subspace poorly approximates the solution. To address cases such as these, we have developed an efficient nonlinear manifold ROM (NM-ROM), which can better approximate high-fidelity model solutions with a smaller latent space dimension than the LS-ROMs. Our method takes advantage of the existing numerical methods that are used to solve the corresponding full order models (FOMs). The efficiency is achieved by developing a hyper-reduction technique in the context of the NM-ROM. Numerical results show that neural networks can learn a more efficient latent space representation on advection-dominated data from 2D Burgers' equations with a high Reynolds number. A speed-up of up to 11.7 for 2D Burgers' equations is achieved with an appropriate treatment of the nonlinear terms through a hyper-reduction technique.

Youngkyu Kim, Youngsoo Choi, David Widemann, Tarek Zohdi

Many problems in engineering and design require balancing competing objectives in the presence of uncertainty. The standard approach in the literature characterizes the relationship between design decisions and their corresponding outcomes as a Pareto frontier, which is discovered through multiobjective optimization. In this position paper, we suggest that this approach is not ideal for reasoning about practical design decisions. Instead of multiobjective optimization, we propose soliciting desired minimum performance constraints on all objectives to define regions of satisfactory performance. We present work-in-progress which visualizes the design decisions that consistently satisfy user-defined thresholds in an additive manufacturing problem.

Gustavo Malkomes, Harvey Cheng, Michael McCourt
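
The contrast drawn in the abstract above, between a Pareto frontier and a region defined by minimum performance thresholds, can be made concrete with a toy example. This is a hypothetical sketch: the design names, the two objectives (`strength`, `speed`, both to be maximized) and the threshold values are invented for illustration and do not come from the paper.

```python
def pareto_front(designs):
    """Names of designs not dominated by any other (both objectives maximized)."""
    front = []
    for d in designs:
        dominated = any(
            o["strength"] >= d["strength"] and o["speed"] >= d["speed"]
            and (o["strength"] > d["strength"] or o["speed"] > d["speed"])
            for o in designs
        )
        if not dominated:
            front.append(d["name"])
    return front


def satisfactory(designs, min_strength, min_speed):
    """Names of designs meeting the user-defined minimum on every objective."""
    return [d["name"] for d in designs
            if d["strength"] >= min_strength and d["speed"] >= min_speed]


# Hypothetical candidate designs with made-up objective values.
designs = [
    {"name": "A", "strength": 9, "speed": 2},
    {"name": "B", "strength": 7, "speed": 6},
    {"name": "C", "strength": 6, "speed": 5},
    {"name": "D", "strength": 3, "speed": 9},
    {"name": "E", "strength": 5, "speed": 5},
]

front = pareto_front(designs)                                  # ['A', 'B', 'D']
acceptable = satisfactory(designs, min_strength=5, min_speed=5)  # ['B', 'C', 'E']
```

In this toy case the frontier keeps the extreme designs A and D, which fail a threshold, while the threshold view keeps C and E, which are dominated but still acceptable.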

The heat pattern of cities is characterized by higher temperatures than the surrounding environment, and cities are vulnerable to heat-induced risk because of their dense populations. Therefore, fast and accurate heat risk assessment is desired for mitigation plans and sustainable community management. This paper introduces a probabilistic model to forecast the meso-scale surface temperature at a relatively low computational cost, as an alternative to computationally intensive Numerical Weather Prediction (NWP) models. After calibrating the model, we integrate it into a probabilistic risk analysis framework to estimate the extreme temperature distribution around cities. Integrated with this framework, the surrogate model expands its applicability, providing insights into future risk and supporting various statistical inferences.

Byeongseong Choi, Matteo Pozzi, Mario Berges

Pipeline integrity is an important area of concern for the oil and gas, refining, chemical, hydrogen, carbon sequestration, and electric-power industries, due to the safety risks associated with pipeline failures. Regular monitoring, inspection, and maintenance of these facilities is therefore required for safe operation. Large stand-off magnetometry (LSM) is a non-intrusive, passive magnetometer-based measurement technology that has shown promise in detecting defects (anomalies) in regions of elevated mechanical stresses. However, analyzing the noisy multi-sensor LSM data to clearly identify regions of anomalies is a significant challenge. This is mainly due to the high frequency of the data collection, misalignment between consecutive inspections and sensors, as well as the number of sensor measurements recorded. In this paper we present an LSM defect identification approach based on machine learning (ML). We show that this ML approach is able to successfully detect anomalous readings using a series of methods with increasing model complexity and capacity. The methods start from unsupervised learning with "point" methods and eventually increase complexity to supervised learning with sequence methods and multi-output predictions. We observe data leakage issues for some methods with randomized train/test splitting and resolve them by specific non-randomized splitting of training and validation data. We also achieve a 200x acceleration of the support-vector classifier (SVC) method by porting computations from CPU to GPU, leveraging the cuML RAPIDS AI library. For sequence methods, we develop a customized Convolutional Neural Network (CNN) architecture based on 1D convolutional filters to identify and characterize multiple properties of these defects. In the end, we report the scalability of the best-performing methods and compare them for viability in field trials.

Peetak Mitra, Denis Akhiyarov, Mauricio Araya-Polo, Daniel Byrd

The manufacturing industry is one of the largest industries in the world, vitally supporting the economies of many countries across the globe. With the growing deployability of artificial intelligence (AI), manufacturers are turning to AI to turn their production plants into more efficient smart factories. Smart factories have contributed towards improving worker safety and their high efficiency means that they can deliver quality products faster to their customers. As the manufacturing industry embraces machine learning, demand for user-friendly tools that can deploy complex machine learning models with relative ease for engineering professionals has been growing over the years. In particular, deep learning tools need a considerable amount of programming knowledge and, thus, remain obscure to engineers inexperienced with programming. To overcome these barriers, we propose ManufacturingNet, an open-source machine learning tool for engineers which will enable them to develop and deploy complex machine learning models by answering a few simple questions. We also have curated ten publicly-available datasets and benchmarked the performance using ManufacturingNet's machine learning models. We obtained state-of-the-art results for each dataset and have included pre-trained models with our package. We believe ManufacturingNet will enable engineers around the world to deploy machine learning models with ease. The GitHub repository for ManufacturingNet can be found at https://github.com/BaratiLab/ManufacturingNet. Keywords: Manufacturing, Deep Learning, Programming, ManufacturingNet

Rishikesh Magar, Lalit Ghule, Ruchit Doshi, Sharan Seshadri, Aman Khalid, Amir Barati Farimani

Security-constrained optimal power flow (SCOPF) is a critical problem for the operation of power systems, aiming to schedule power generation in a way that is robust to potential equipment failures. However, many SCOPF approaches require constructing large optimization problems that explicitly account for each of these potential system failures, thus suffering from issues of computational complexity that limit their use in practice. In this paper, we propose an approach to solving SCOPF inspired by adversarially robust training in neural networks. In particular, we frame SCOPF as a bi-level optimization problem -- viewing power generation settings as parameters associated with a neural network defender, and equipment failures as (adversarial) attacks -- and solve this problem via gradient-based techniques. We describe the results of initial experiments on a 30-bus test system.

Neeraj Vijay Bedmutha, Priya Donti, J. Zico Kolter

Imaging modalities provide clinicians with real-time visualization of anatomical regions of interest (ROI) for the purpose of minimally invasive surgery. During the procedure, low-resolution image data are acquired and registered with a high-resolution preoperative 3D reconstruction to guide the execution of the surgical preplan. Unfortunately, due to the potentially large strains and nonlinearities in the deformation of soft biological tissues, significant mismatch may be observed between ROI shapes during pre- and intra-operative imaging stages, making the surgical preplan prone to failure. In an effort to bridge the gap between the two imaging stages, this paper presents a data-driven approach based on an artificial neural network for predicting the ROI deformation in real time with sparsely registered fiducial markers. For a head-and-neck tumor model with an average maximum displacement of 30 mm, the maximum surface offsets between benchmarks and predictions using the proposed approach for 98% of the test cases are under 1.0 mm, which is the typical resolution of high-quality interventional ultrasound. Each of the prediction processes takes less than 0.5 s. With the resulting prediction accuracy and computational efficiency, the proposed approach demonstrates its potential to be clinically relevant.

Haolin Liu, Ye Han, Daniel Emerson, Houriyeh Majditehran, Yoed Rabin, Levent Burak Kara

Deep learning and machine learning have recently attracted remarkable attention in the inverse design of nanostructures. However, limited works have used these techniques to reduce the design complexity of structures. In this work, we present an evolutionary-based method using manifold learning for inverse design of nanostructures with minimal design complexity. This method encodes the high dimensional spectral responses obtained by electromagnetic simulation software for a class of nanostructures with different design complexities using an autoencoder (AE). We model the governing distributions of the data in the latent space using Gaussian mixture models (GMMs), which then provide the level of feasibility of a desired response for each structure, and use a neural network (NN) to find the optimum solution. This method also provides valuable information about the underlying physics of light-matter interactions by representing the sub-manifolds of feasible regions for each design complexity level (i.e., number of design parameters) in the latent space. To show the applicability of the method, we employ this technique for inverse design of a class of nanostructures consisting of dielectric metasurfaces with different complexity degrees.

Mohammadreza Zandehshahvar, Yashar Kiarashinejad, Muliang Zhu, Hossein Maleki, Omid Hemmatyar, Sajjad Abdollahramezani, Reza Pourabolghasem, Ali Adibi

Incompressible fluid flow around a cylinder is one of the classical problems in fluid dynamics, with strong relevance to many real-world engineering problems, for example, the design of offshore structures or of a pin-fin heat exchanger. Thus, learning a high-accuracy surrogate for this problem can demonstrate the efficacy of a novel machine learning approach. In this work, we propose a physics-informed neural network (PINN) architecture for learning the relationship between simulation output and the underlying geometry and boundary conditions. In addition to using a physics-based regularization term, the proposed approach also exploits the underlying physics to learn a set of Fourier features, i.e., frequency and phase offset parameters, and then uses them for predicting flow velocity and pressure over the spatio-temporal domain. We demonstrate this approach by predicting simulation results over out-of-range time intervals and for novel design conditions. Our results show that incorporating Fourier features improves the generalization performance over both the temporal domain and the design space.

Tongtao Zhang, Biswadip Dey, Pratik Kakkar, Arindam Dasgupta, Amit Chakraborty

Two-dimensional nanomaterials, such as graphene, have been extensively studied because of their outstanding physical properties. Structure and geometry optimization of nanopores on such materials is beneficial for their performance in real-world engineering applications such as water desalination. However, the optimization process often involves very large numbers of experiments or simulations, which are expensive and time-consuming. In this work, we propose a graphene nanopore optimization framework that combines deep reinforcement learning (DRL) and a convolutional neural network (CNN) for efficient water desalination. The DRL agent controls the geometry of the nanopore, while the CNN is employed to predict the water flux and ion rejection of the nanoporous graphene membrane at a certain external pressure. With the CNN-accelerated property prediction, our DRL agent can optimize the nanoporous graphene efficiently in an online manner. Experiments show that our framework can design nanopore structures that are promising for energy-efficient water desalination.

Yuyang Wang, Zhonglin Cao, Amir Barati Farimani

Generative models are now used to create a variety of high-quality digital artifacts. Yet their use in designing physical objects has received far less attention. In this paper, we argue for the building toy LEGO as a platform for developing generative models of sequential assembly. We develop a generative model based on graph-structured neural networks that can learn from human-built structures and produce visually compelling designs.

Rylee Thompson, Graham Taylor, Terrance DeVries, Elahe Ghalebi

Many real-world applications involve black-box optimization of multiple objectives using continuous function approximations that trade off accuracy and resource cost of evaluation. For example, in rocket launching research, we need to find designs that trade off return-time and angular distance using continuous-fidelity simulators (e.g., varying a tolerance parameter to trade off simulation time and accuracy) for design evaluations. The goal is to approximate the optimal Pareto set while minimizing the cost of evaluations. In this paper, we propose a novel approach referred to as information-Theoretic Multi-Objective Bayesian Optimization with Continuous Approximations (iMOCA) to solve this problem. The key idea is to select the sequence of input and function approximations for multiple objectives which maximize the information gain per unit cost for the optimal Pareto front. Our experiments on diverse synthetic and real-world benchmarks show that iMOCA significantly improves over existing single-fidelity methods.

Syrine Belakaria, Aryan Deshwal, Jana Doppa

Planning future operational scenarios of bulk power systems that meet security and economic constraints typically requires intensive labor in performing massive simulations. To automate this process and relieve engineers' burden, a novel multi-stage approach is presented in this paper to train centralized and decentralized reinforcement learning agents that can automatically adjust grid controllers for regulating transmission line flows under normal conditions and under contingencies. The power grid flow control problem is formulated as a Markov Decision Process (MDP). At Stage 1, a centralized soft actor-critic (SAC) agent is trained to control generator active power outputs in a wide area so as to keep transmission line flows within specified security limits. If line overloading issues remain unresolved, Stage 2 trains decentralized SAC agents via load throw-over at local substations. The effectiveness of the proposed approach is verified on a series of actual planning cases used for operating the power grid of SGCC Zhejiang Electric Power Company.

Xiumin Shang, Jingping Yang, Bingquan Zhu, Lin Ye, Jing Zhang, Jianping Xu, Qin Lyu, Ruisheng Diao

Near-term prediction of the structured spatio-temporal processes driving our climate is of profound importance to the safety and well-being of millions, but the pronounced nonlinear convection of these processes makes a complete mechanistic description even of the short-term dynamics challenging. However, convective transport provides not only a principled physical description of the problem, but is also indicative of the transport in time of informative features, which has led to the recent successful development of "physics free" approaches. In this work we demonstrate that there remains an important role to be played by physically informed models, which can successfully leverage deep learning (DL) to project the process onto a lower dimensional space on which a minimal dynamical description holds. Our approach synthesises the feature extraction capabilities of DL with physically motivated dynamics to outperform existing model-free approaches, as well as state-of-the-art hybrid approaches, on complex real-world datasets including sea surface temperature and precipitation.

Daniel Tait

Reducing the carbon footprint in cement production is a pressing challenge faced by the construction industry. In the past few years, annual world cement consumption has been approximately 4 billion tons, with each ton leading to about 1 ton of CO2 emissions. To curb the massive environmental impact, it is pertinent to improve material performance and reduce the embodied carbon of cement. This requires an in-depth understanding of how cement strength is controlled by its chemical composition. Although this problem has been investigated for more than one hundred years, our current knowledge is still insufficient for a clear decomposition of this complex composition-strength relationship. Here, we take advantage of Gaussian process regression (GPR) to decipher the fundamental compositional attributes (the cement "genome") that determine cement strength performance. Among all machine learning methods applied to the same dataset, our GPR model achieves the highest accuracy in predicting cement strength based on the chemical compounds. Based on the optimized GPR model, we are able to decompose the influence of each oxide on cement strength to an unprecedented level.

Yu Song, Yongzhe Wang, Kaixin Wang, Mathieu Bauchy

Buildings produce more U.S. greenhouse gas emissions through electricity generation than any other economic sector. To improve the energy efficiency of buildings, engineers often rely on physics-based building simulations to predict the impacts of retrofits in individual buildings. In dense urban areas, these models suffer from inaccuracy due to imprecise parameterization or external, unmodeled urban context factors such as inter-building effects and urban microclimates. In a case study of approximately 30 buildings in Sacramento, California, we demonstrate how our hybrid physics-driven deep learning framework can use these external factors advantageously to identify a more optimal energy efficiency retrofit installation strategy and achieve significant savings in both energy and cost.

Benjamin Choi, Alex Nutkiewicz, Rishee Jain

Local-gradient-based optimization approaches lack the nonlocal exploration ability required for escaping from local minima when searching non-convex landscapes. A directional Gaussian smoothing (DGS) approach was recently proposed in \cite{2020arXiv200203001Z} and used to define a truly nonlocal gradient, referred to as the DGS gradient, in order to enable nonlocal exploration in high-dimensional black-box optimization. Promising results show that replacing the traditional local gradient with the nonlocal DGS gradient can significantly improve the performance of gradient-based methods in optimizing highly multi-modal loss functions. However, the current DGS method is designed for unbounded and unconstrained optimization problems, making it inapplicable to real-world engineering optimization problems where the tuning parameters are often bounded and the loss function is usually constrained by physical processes. In this work, we propose to extend the DGS approach to the constrained inverse design framework in order to find better optima of multi-modal loss functions. A series of adaptive strategies for updating the smoothing radius and learning rate are developed to improve the computational efficiency and robustness. Our methodology is demonstrated by an example of designing a nanoscale wavelength demultiplexer, and shows superior performance compared to the state-of-the-art approaches. By incorporating volume constraints, the optimized design achieves an equivalently high performance but significantly reduces the amount of material usage.

Sirui Bi, Jiaxin Zhang, Guannan Zhang

We describe an approach to learning optimal control policies for a large, linear particle accelerator using deep reinforcement learning coupled with a high-fidelity physics engine. The framework consists of an AI controller that uses deep neural nets for state and action-space representation and learns optimal policies using reward signals that are provided by the physics simulator. For this work, we focus only on controlling a small section of the entire accelerator. Nevertheless, initial results indicate that we can achieve better-than-human level performance in terms of particle beam current and distribution. The ultimate goal of this line of work is to reduce the tuning time for such facilities by orders of magnitude, and achieve near-autonomous control.

Xiaoying Pang, Sunil Thulasidasan, Larry Rybarcyk

Engineering applications typically require a mathematical reduction of a complex physical model to a simpler representation; unfortunately, this simplification typically leads to a missing physics problem. In this work we introduce a state space solution to recovering the hidden physics by sharing information between different operating scenarios, referred to as "tasks". We introduce an approximation that ensures the resulting model scales linearly in the number of tasks, and provide theoretical guarantees that this solution will exist for sufficiently small time-steps. Finally, we demonstrate how this framework may be used to improve the prediction of lithium-ion concentration in electric batteries.

Daniel Tait, Ferran Brosa Planella, Widanalage Dhammika Widanage, Theo Damoulas

The adoption of Machine Learning (ML) for building emulators for complex physical processes has seen an exponential rise in recent years. While ML models are good function approximators, optimizing the hyper-parameters of the model to reach a global minimum is not trivial, and often needs human knowledge and expertise. In this light, automatic ML or autoML methods have gained large interest as they automate the process of network hyper-parameter tuning. In addition, Neural Architecture Search (NAS) has shown promising outcomes for improving model performance. While autoML methods have grown in popularity for image, text and other applications, their effectiveness for high-dimensional, complex scientific datasets remains to be investigated. In this work, a data-driven emulator for turbulence closure terms in the context of Large Eddy Simulation (LES) models is trained using Artificial Neural Networks, and an autoML framework based on Bayesian Optimization is proposed that incorporates priors to jointly optimize the hyper-parameters and conduct a full neural network architecture search to converge to a global minimum. Additionally, the effect of using different network weight initializations and optimizers, such as ADAM, SGDM and RMSProp, is explored. Weight and function space similarities during the optimization trajectory are investigated, and critical differences in the learning process evolution are noted and compared to theory. We observe that the ADAM optimizer and Glorot initialization consistently perform better, while RMSProp outperforms SGDM, as the latter appears to have been stuck at a local optimum. Therefore, this autoML BayesOpt framework provides a means to choose the best hyper-parameter settings for a given dataset.

Peetak Mitra, Niccolo Dal Santo, Majid Haghshenas, Shounak Mitra, Conor Daly, David Schmidt

Robots are increasingly pervasive in manufacturing. However, robotic grippers are often still very simple parallel-jaw grippers with flat fingers, which are very sub-optimal for many objects. Having engineers design a new gripper for every object is a very expensive and inefficient process. We instead propose to automatically design them using machine learning. First, we use Evolutionary Strategies in simulation to get a good initial gripper. We also propose an automatic curriculum design that automatically increases the difficulty of the design task in simulation to ease the design process. Once the gripper is designed in simulation we fine-tune it via back-propagation on a Graph Neural Network model trained on real data for many grippers and objects. By amortizing real-world data across grippers and objects we can be very data-efficient in the real world, leveraging prior experience in a manner analogous to that of meta-learning. We show that our method improves the default gripper by significant margins on multiple datasets of varied objects.

Ferran Alet, Maria Bauza, Adarsh K Jeewajee, Max Thomsen, Alberto Rodriguez, Leslie Kaelbling, Tomás Lozano-Pérez

Nondestructive testing (NDT) is widely applied to defect identification of turbine components during manufacturing and operation. Operational efficiency is key for gas turbine OEMs (Original Equipment Manufacturers). Automating the inspection process as much as possible, while minimizing the uncertainties involved, is thus crucial. We propose a model based on RetinaNet to identify drilling defects in X-ray images of turbine blades. The application is challenging due to the large image resolutions, in which defects are very small and hardly captured by the commonly used anchor sizes, and also due to the small size of the available dataset. All of these issues are common in the application of Deep Learning-based object detection models to industrial defect data. We overcome such issues using open source models, splitting the input images into tiles and scaling them up, applying heavy data augmentation, and optimizing the anchor size and aspect ratios with a differential evolution solver. We validate the model with 3-fold cross-validation, showing a very high accuracy in identifying images with defects. We also define a set of best practices which can help other practitioners overcome similar challenges.

Andrea Panizza, Szymon Tomasz Stefanek, Stefano Melacci, Giacomo Veneri, Marco Gori

Topology optimization (TO) is a popular and powerful computational approach for designing novel structures, materials, and devices. Two computational challenges have limited the applicability of TO to a variety of industrial applications. First, a TO problem often involves a large number of design variables to guarantee sufficient expressive power. Second, many TO problems require a large number of expensive physical model simulations, and those simulations cannot be parallelized. To address these issues, we propose a general scalable deep-learning (DL) based TO framework, referred to as SDL-TO, which utilizes parallel CPU+GPU schemes to accelerate the TO process for designing additively manufactured (AM) materials. Unlike existing studies of DL for TO, our framework accelerates TO by learning from the iterative history data while simultaneously training on the mapping between the given design and its gradient. The surrogate gradient is learned by combining parallel computing on multiple CPUs with distributed DL training on multiple GPUs. The surrogate gradient enables a fast online update scheme instead of an expensive update. Using a local sampling strategy, we reduce the intrinsic high dimensionality of the design space and improve the training accuracy and the scalability of the SDL-TO framework. The method is demonstrated on benchmark examples and on AM materials design for heat conduction, and shows competitive performance compared to the baseline methods while significantly reducing the computational cost, with a speedup of 8.6x over a standard TO implementation.

Sirui Bi, Jiaxin Zhang, Guannan Zhang

Author Information

Alex Beatson (Princeton University)
Priya Donti (Carnegie Mellon University)
Amira Abdel-Rahman (MIT)
Stephan Hoyer (Google)
Rose Yu (University of California, San Diego)
J. Zico Kolter (Carnegie Mellon University / Bosch Center for AI)

Zico Kolter is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, and also serves as Chief Scientist of AI Research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a large focus on developing more robust, explainable, and rigorous methods in deep learning. In addition, he has worked on a number of application areas, highlighted by work on sustainability and smart energy systems. He is the recipient of the DARPA Young Faculty Award, and best paper awards at KDD, IJCAI, and PESGM.

Ryan Adams (Princeton University)

More from the Same Authors