

Session: Tutorial Track 2C


Mon 7 Dec. 11:00 - 13:30 PST

(Track2) Machine Learning for Astrophysics and Astrophysics Problems for Machine Learning

David W Hogg · Kate Storey-Fisher

The field of astrophysics has been an avid consumer—and also a developer—of new methods in data science (maybe even dating back to the invention of Bayesian inference). With constantly growing data volumes, increasingly complex and costly physical models, and demand for extremely precise measurements, astrophysics presents opportunities for innovation in machine learning (ML) methods.

In this tutorial, we will give a sense of the myriad connections between astrophysics and ML, and demonstrate that astrophysics is an ideal sandbox for developing and testing ML applications and innovations. We will also discuss areas where vanilla ML methods fail or require extension or elaboration to be competitive with traditional astronomy techniques.

Astronomical data falls into four broad types: imaging, spectroscopy, time series, and catalogs. We will discuss the scientific understandings and precise measurements that we hope to obtain from these data sets, the challenges specific to each of them, and the successes and opportunities for ML applications in these domains. We will demonstrate how to obtain and start working with current leading-edge public data sets of each type. Participants should expect to do hands-on work with the data during the tutorial (we’ll demo with Python and Jupyter, but any platform can play). By the end, we hope that participants will be able to download, visualize, and apply ML algorithms to astronomical data, in ways relevant to current research directions in astrophysics. DWH and KSF thank the members of the Astronomical Data Group at the Flatiron Institute for support with the ideas, code, and content in this tutorial.
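To give a flavor of the hands-on catalog exercise, here is a minimal sketch (not the tutorial's own notebook) that queries SDSS galaxy photometry and spectroscopic redshifts via astroquery and fits a photometric-redshift regressor with scikit-learn; the table and column names assume the public SDSS SQL schema, and the sample size and model choice are arbitrary.

```python
# Minimal sketch: SDSS photometric redshifts with astroquery + scikit-learn.
# Assumes the public SDSS SkyServer schema (PhotoObj, SpecObj).
import numpy as np
from astroquery.sdss import SDSS
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Pull galaxies that have both ugriz photometry and a spectroscopic redshift.
table = SDSS.query_sql("""
    SELECT TOP 10000 p.u, p.g, p.r, p.i, p.z AS zmag, s.z AS redshift
    FROM PhotoObj AS p JOIN SpecObj AS s ON s.bestobjid = p.objid
    WHERE s.class = 'GALAXY' AND s.z BETWEEN 0 AND 1
""")

# Use colors (magnitude differences) as features; spectroscopic z as the label.
mags = np.vstack([table[c] for c in ("u", "g", "r", "i", "zmag")]).T
X = mags[:, :-1] - mags[:, 1:]          # u-g, g-r, r-i, i-z
y = np.asarray(table["redshift"])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("photo-z RMS error:", np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2)))
```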

Mon 7 Dec. 13:30 - 16:00 PST

(Track2) Explaining Machine Learning Predictions: State-of-the-art, Challenges, and Opportunities

Himabindu Lakkaraju · Julius Adebayo · Sameer Singh

As machine learning is deployed in all aspects of society, it has become increasingly important to ensure stakeholders understand and trust these models. Decision makers must have a clear understanding of model behavior so they can diagnose errors and potential biases, and decide when and how to employ these models. However, the most accurate models deployed in practice are not interpretable, making it difficult for users to understand where predictions come from and, consequently, to trust them.

Recent work on explanation techniques in machine learning offers an attractive solution: these techniques provide intuitive explanations for “any” machine learning model by approximating complex models with simpler, interpretable ones.
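To make the approximation idea concrete, here is a minimal, LIME-style sketch (not the LIME library's actual API): perturb an instance, query the black-box model, weight the samples by proximity, and fit a weighted linear surrogate whose coefficients serve as local feature importances. The function and parameter names are illustrative.

```python
# Minimal sketch of a local linear surrogate around one instance.
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict_fn, x, n_samples=5000, scale=0.3, rng=None):
    """Return per-feature weights of a proximity-weighted linear fit around x."""
    rng = np.random.default_rng(rng)
    perturbations = x + scale * rng.standard_normal((n_samples, x.size))
    targets = predict_fn(perturbations)                      # black-box outputs
    distances = np.linalg.norm(perturbations - x, axis=1)
    weights = np.exp(-(distances ** 2) / (2 * scale ** 2))   # proximity kernel
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, targets, sample_weight=weights)
    return surrogate.coef_                                    # local importances

# Example (hypothetical binary classifier clf and instance x):
#   importances = local_surrogate(lambda Z: clf.predict_proba(Z)[:, 1], x)
```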

In this tutorial, we will discuss several post hoc explanation methods, focusing on their advantages and shortcomings. We will cover three families of techniques: (a) single-instance gradient-based attribution methods (saliency maps), (b) model-agnostic explanations via perturbations, such as LIME/SHAP and counterfactual explanations, and (c) surrogate modeling for global interpretability, such as MUSE. For each family, we will present the problem setup, prominent methods, and example applications, and then discuss its vulnerabilities and shortcomings. We will conclude with an overview of future directions and a discussion of open research problems. We hope to provide a practical and insightful introduction to explainability in machine learning.
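As an illustration of family (a), here is a minimal sketch of a vanilla gradient saliency map, assuming a PyTorch image classifier that returns class scores of shape (1, num_classes); the tutorial's own examples and frameworks may differ.

```python
# Minimal sketch: gradient of the target-class score with respect to the input.
import torch

def saliency_map(model, image, target_class):
    """Per-pixel absolute input gradient of the target-class score."""
    model.eval()
    image = image.clone().detach().requires_grad_(True)    # shape (1, C, H, W)
    score = model(image)[0, target_class]
    score.backward()
    return image.grad.detach().abs().max(dim=1)[0]          # collapse channels
```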