(Track2) Explaining Machine Learning Predictions: State-of-the-art, Challenges, and Opportunities
Himabindu Lakkaraju · Julius Adebayo · Sameer Singh

Mon Dec 07 01:30 PM -- 04:00 PM (PST)

As machine learning is deployed in all aspects of society, it has become increasingly important to ensure that stakeholders understand and trust these models. Decision makers must have a clear understanding of model behavior so they can diagnose errors and potential biases, and decide when and how to employ the models. However, the most accurate models deployed in practice are often not interpretable, which makes it difficult for users to understand where the predictions come from and, in turn, to trust them.

Recent work on explanation techniques in machine learning offers an attractive solution: these techniques provide intuitive explanations for “any” machine learning model by approximating complex models with simpler ones.

In this tutorial, we will discuss several post hoc explanation methods, focusing on their advantages and shortcomings. We will cover three families of techniques: (a) single-instance gradient-based attribution methods (saliency maps), (b) model-agnostic explanations via perturbations, such as LIME/SHAP and counterfactual explanations, and (c) surrogate modeling for global interpretability, such as MUSE. For each of these approaches, we will present the problem setup, prominent methods, and example applications, and then discuss vulnerabilities and shortcomings. We will conclude the tutorial with an overview of future directions and a discussion of open research problems. We hope to provide a practical and insightful introduction to explainability in machine learning.
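As a rough illustration of family (b), the sketch below fits a proximity-weighted linear model to a black box around a single input, which is the general idea behind perturbation-based explanations. It is a simplified stand-in under stated assumptions, not the LIME or SHAP algorithm: the `black_box` function, the Gaussian perturbation scheme, and every parameter name are hypothetical choices made for this example.

```python
import numpy as np

def black_box(X):
    """Hypothetical opaque model: a nonlinear score over two features."""
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - 2.0 * X[:, 1] ** 2)))

def local_linear_explanation(predict, x, n_samples=2000, scale=0.1, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around x.

    Returns one slope per feature. This is an illustrative simplification
    of perturbation-based explanation, not a reference implementation.
    """
    rng = np.random.default_rng(seed)
    # Probe the model in a neighborhood of x using Gaussian perturbations.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = predict(Z)
    # Weight each perturbed sample by its closeness to x (RBF kernel).
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2.0 * scale ** 2))
    # Weighted least squares with an intercept column, centered at x.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # drop the intercept, keep the per-feature slopes

x0 = np.array([0.5, 0.5])
attributions = local_linear_explanation(black_box, x0)
print(attributions)
```

The signs of the returned slopes indicate how each feature locally pushes the prediction: in this toy model the first feature should receive a positive slope and the second a negative one. The slopes describe the model only near `x0`, and they depend on the sampling scale and kernel width, which relates to the vulnerabilities of such methods that the tutorial discusses.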

Author Information

Himabindu Lakkaraju (Harvard)

Hima Lakkaraju is an Assistant Professor at Harvard University focusing on explainability, fairness, and robustness of machine learning models. She has also been working with various domain experts in criminal justice and healthcare to understand the real-world implications of explainable and fair ML. Hima has recently been named one of the 35 innovators under 35 by MIT Tech Review, and has received best paper awards at the SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io/

Julius Adebayo (MIT)

Julius Adebayo is a Ph.D. student at MIT working on developing and understanding approaches that seek to make machine learning-based systems reliable when deployed. More broadly, he is interested in rigorous approaches to help develop models that are robust to spurious associations and distribution shifts, and that align with 'human' values. Website: https://juliusadebayo.com/

Sameer Singh (University of California, Irvine)

Sameer Singh is an Assistant Professor at UC Irvine working on robustness and interpretability of machine learning. Sameer has presented tutorials and invited workshop talks at EMNLP, NeurIPS, NAACL, WSDM, ICLR, ACL, and AAAI, and received paper awards at KDD 2016, ACL 2018, EMNLP 2019, AKBC 2020, and ACL 2020. Website: http://sameersingh.org/
