Program Transformations for ML
Pascal Lamblin · Atilim Gunes Baydin · Alexander Wiltschko · Bart van Merriënboer · Emily Fertig · Barak Pearlmutter · David Duvenaud · Laurent Hascoet

Sat Dec 14 08:00 AM -- 06:00 PM (PST) @ West 114 + 115
Event URL: https://program-transformations.github.io/

Machine learning researchers often express complex models as a program, relying on program transformations to add functionality. New languages and transformations (e.g., TorchScript and TensorFlow AutoGraph) are becoming core capabilities of ML libraries. However, existing transformations, such as automatic differentiation (AD), inference in probabilistic programming languages (PPL), and optimizing compilers, are often built in isolation and limited in scope. This workshop aims at viewing program transformations in ML in a unified light, making these capabilities more accessible, and building entirely new ones.
Program transformations are an area of active study. AD transforms a program performing numerical computation into one computing the gradient of those computations. In PPL, a program describing a sampling procedure can be modified to perform inference on model parameters given observations. Other examples are vectorizing a program expressed on one data point, and learned transformations where ML models use programs as inputs or outputs.
This workshop will bring together researchers in the fields of AD, programming languages, compilers, and ML, with the goal of understanding the commonalities between disparate approaches and views, and sharing ways to make these techniques broadly available. This would enable ML practitioners to iterate faster on novel models and architectures (e.g., those naturally expressed through high-level constructs like recursion).
Topics:
- Abstractions and syntax (beyond meta-programming and operator overloading) to naturally express a program (expression, or procedure) as an object to be manipulated
- Techniques from AD and PPL the ML community could adopt to enable research on new models
- How to overcome challenges due to ML's specific hardware (GPUs, specialized chips) and software (Python) stacks, and the particular demands of practitioners for their tools
- Greater collaboration between the ML and programming languages communities

Sat 8:30 a.m. - 8:40 a.m.
Opening statements (Introduction)
Sat 8:40 a.m. - 9:30 a.m.

Deep learning and probabilistic programming have much in common; both rely on software abstractions to enable iterative model development.

In this talk we discuss how we can integrate techniques from both domains in problems where we would like to use priors to induce structured representations. To do so, we employ reweighted wake-sleep methods, which combine importance sampling methods (which have been operationalized in probabilistic programming) with variational methods for learning proposals.

To enable a more iterative design of these methods, we introduce compositional constructs, which we refer to as combinators. They define both model structure and evaluation strategies that correspond to different importance sampling schemes. Together these constructs define a path towards a more compositional design of variational methods that are correct by construction.

Jan-Willem van de Meent
Sat 9:30 a.m. - 9:50 a.m.
Applications of a disintegration transformation (Talk)
Praveen Narayanan
Sat 9:50 a.m. - 10:30 a.m.
Coffee break (Break)
Sat 10:30 a.m. - 11:20 a.m.

Probabilities are extensively used in Computer Science. Algorithms use probabilistic choices for improving efficiency or even for tackling problems that are unsolvable with deterministic computing. Recently, (Functional) Probabilistic Programming has been introduced for applications in Machine Learning and Artificial Intelligence. Probabilistic programs are used to describe statistical models and for developing probabilistic data analysis.

In Probabilistic Programming Languages, inference algorithms are often delegated to compilers, including their optimizations. These program transformations are error-prone, yet they must not change the probabilistic models; hence the need for formal methods to avoid bugs. Developing formal semantics for probabilistic computing is challenging but crucial in order to systematize the analysis and certification of probabilistic programs.

In this talk, I will first introduce functional probabilistic programming and the related problems. Then, I will present recent work on the semantics of probabilistic computing, based on approximating programs according to their use of resources.

Christine Tasson
Sat 11:20 a.m. - 11:40 a.m.
The Differentiable Curry (Talk)
Dimitrios Vytiniotis
Sat 11:40 a.m. - 12:00 p.m.
Functional Tensors for Probabilistic Programming (Talk)
Fritz Obermeyer
Sat 12:00 p.m. - 2:00 p.m.
Lunch break & Poster session (Poster Session)
Breandan Considine, Mike Innes, Du Phan, Dougal Maclaurin, Robin Manhaeve, Alexey Radul, Shashi Gowda, Ekansh Sharma, Eli Sennesh, Maxim K Kochurov, Gordon Plotkin, Thomas Wiecki, Navjot Kukreja, Chung-chieh Shan, Matthew Johnson, Dan Belov, Neeraj Pradhan, Wannes Meert, Angelika Kimmig, Luc De Raedt, Brian Patton, Matthew Hoffman, Rif A. Saurous, Dan Roy, Eli Bingham, Martin Jankowiak, Colin Carroll, Junpeng Lao, Liam Paull, Martin Abadi, Angel Rojas Jimenez, JP Chen
Sat 2:00 p.m. - 2:50 p.m.
Optimized execution of PyTorch programs with TorchScript (Talk)
Zachary DeVito
Sat 2:50 p.m. - 3:40 p.m.

JAX is a system for high-performance machine learning research. It offers the familiarity of Python+NumPy together with hardware acceleration, and it enables the definition and composition of user-wielded function transformations useful for machine learning programs. These transformations include automatic differentiation, automatic batching, end-to-end compilation (via XLA), parallelizing over multiple accelerators, and more. Composing these transformations is the key to JAX's power and simplicity.

Skye Wanderman-Milne
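
The composition of transformations described in the JAX abstract above can be illustrated with a minimal sketch. It is not taken from the talk: the `loss` function and the input values are hypothetical, chosen only to show how automatic differentiation (`jax.grad`), automatic batching (`jax.vmap`), and compilation via XLA (`jax.jit`) can be stacked on a function written for a single data point.

```python
import jax
import jax.numpy as jnp


def loss(w, x):
    # Hypothetical scalar loss for ONE data point x (illustrative only).
    return jnp.sum(jnp.tanh(w * x) ** 2)


# Automatic differentiation: a new function computing d(loss)/dw.
grad_loss = jax.grad(loss)

# Automatic batching: map grad_loss over the leading axis of x,
# while sharing the same w across the batch (in_axes=None for w).
batched_grad = jax.vmap(grad_loss, in_axes=(None, 0))

# End-to-end compilation of the composed function via XLA.
fast_batched_grad = jax.jit(batched_grad)

w = jnp.array([0.5, -1.0, 2.0])
xs = jnp.arange(12.0).reshape(4, 3)  # a batch of 4 data points

# One gradient per data point, so the result has shape (4, 3).
per_example_grads = fast_batched_grad(w, xs)
```

Each transformation takes a function and returns a function, so the order of composition is an ordinary design choice; here the per-example gradient is defined first and then batched and compiled as a whole.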
Sat 3:40 p.m. - 4:20 p.m.
Coffee break (Break)
Sat 4:20 p.m. - 4:40 p.m.
Generalized Abs-Linear Learning (Talk)
Andreas Griewank
Sat 4:40 p.m. - 5:00 p.m.
Towards Polyhedral Automatic Differentiation (Talk)
Jan Hueckelheim
Sat 5:00 p.m. - 5:20 p.m.
Taylor-Mode Automatic Differentiation for Higher-Order Derivatives in JAX (Talk)
Jesse Bettencourt
Sat 5:20 p.m. - 6:00 p.m.
Panel and general discussion (Panel Discussion)

Author Information

Pascal Lamblin (Google)
Atilim Gunes Baydin (University of Oxford)
Alexander Wiltschko (Google Brain)
Bart van Merriënboer (Google)
Emily Fertig (Google Research)
Barak Pearlmutter (Maynooth University)
David Duvenaud (University of Toronto)

David Duvenaud is an assistant professor in computer science at the University of Toronto. His research focuses on continuous-time models, latent-variable models, and deep learning. He completed his postdoc at Harvard University and his Ph.D. at the University of Cambridge. David also co-founded Invenia, an energy forecasting and trading company.

Laurent Hascoet (INRIA)
