The field of astrophysics has long been an avid consumer, and also a developer, of new methods in data science (arguably dating back to the invention of Bayesian inference). With constantly growing data volumes, increasingly complex and costly physical models, and a demand for extremely precise measurements, astrophysics presents opportunities for innovation in machine learning (ML) methods.
In this tutorial, we will give a sense of the myriad connections between astrophysics and ML, and demonstrate that astrophysics is an ideal sandbox for developing and testing ML applications and innovations. We will also discuss areas where vanilla ML methods fail or require extension or elaboration to be competitive with traditional astronomy techniques.
Astronomical data falls into four broad types: imaging, spectroscopy, time series, and catalogs. We will discuss the scientific understanding and precise measurements that we hope to obtain from these data sets, the challenges specific to each of them, and the successes and opportunities for ML applications in these domains. We will demonstrate how to obtain and start working with current leading-edge public data sets of each type. Participants should expect to do hands-on work with the data during the tutorial (we’ll demo with Python and Jupyter, but any platform can play). By the end, we hope that participants will be able to download, visualize, and apply ML algorithms to astronomical data, in ways relevant to current research directions in astrophysics.
DWH and KSF thank the members of the Astronomical Data Group at the Flatiron Institute for support with the ideas, code, and content in this tutorial.