Poster
Towards Automatic Concept-based Explanations
Amirata Ghorbani · James Wexler · James Zou · Been Kim
Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #86
Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions.
Most current explanation methods provide explanations through feature importance scores, which identify the features that are important for each individual input. However, systematically summarizing and interpreting such per-sample feature importance scores is itself challenging. In this work, we propose principles and desiderata for concept-based explanation, which goes beyond per-sample features to identify higher-level, human-understandable concepts that apply across the entire dataset. We develop a new algorithm, ACE, to automatically extract visual concepts. Our systematic experiments demonstrate that ACE discovers concepts that are human-meaningful, coherent, and important for the neural network's predictions.
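The abstract does not spell out the procedure, so the following is only a toy sketch of the general idea of a dataset-level concept explanation, not the authors' implementation. It assumes that image segments have already been embedded as activation vectors, groups them into candidate concepts with a plain k-means, and scores one concept by the fraction of class gradients that point along the concept direction. All data is synthetic, every function name (`kmeans`, `concept_direction`, `concept_score`) is hypothetical, and the mean-difference direction is a simplification standing in for a learned linear separator.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the final centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


def concept_direction(concept_acts, random_acts):
    """Unit vector from the mean random activation to the mean concept activation.

    Assumption: a simplification used here in place of a trained linear classifier.
    """
    diff = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def concept_score(class_grads, direction):
    """Fraction of per-sample gradients with a positive component along the concept."""
    return float(np.mean(class_grads @ direction > 0))


# Synthetic stand-ins for segment activations: two well-separated groups.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(100, 8))
blob_b = rng.normal(loc=3.0, scale=0.1, size=(100, 8))
segment_acts = np.vstack([blob_a, blob_b])

labels, centers = kmeans(segment_acts, k=2)

# Treat the cluster containing the last segment as the concept of interest.
random_acts = rng.normal(loc=1.5, scale=1.0, size=(100, 8))
direction = concept_direction(segment_acts[labels == labels[-1]], random_acts)

# Hypothetical gradients of a class logit with respect to the activations.
class_grads = rng.normal(size=(50, 8))
score = concept_score(class_grads, direction)
print(f"concept importance score: {score:.2f}")
```

Unlike a per-sample importance map, the resulting score is a single number per concept and class, which is what makes it a summary over the whole dataset.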
Author Information
Amirata Ghorbani (Stanford University)
James Wexler
James Zou (Stanford University)
Been Kim (Google)
More from the Same Authors
- 2022: Protein structure generation via folding diffusion
  Kevin Wu · Kevin Yang · Rianne van den Berg · James Zou · Alex X Lu · Ava Soleimany
- 2022: Predicting Immune Escape with Pretrained Protein Language Model Embeddings
  Kyle Swanson · Howard Chang · James Zou
- 2022: DrML: Diagnosing and Rectifying Vision Models using Language
  Yuhui Zhang · Jeff Z. HaoChen · Shih-Cheng Huang · Kuan-Chieh Wang · James Zou · Serena Yeung
- 2021: Interpretability of Machine Learning in Computer Systems: Analyzing a Caching Model
  Leon Sixt · Evan Liu · Marie Pellat · James Wexler · Milad Hashemi · Been Kim · Martin Maas
- 2020 Poster: Debugging Tests for Model Explanations
  Julius Adebayo · Michael Muelly · Ilaria Liccardi · Been Kim
- 2020 Poster: Neuron Shapley: Discovering the Responsible Neurons
  Amirata Ghorbani · James Zou
- 2020 Poster: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou
- 2020 Oral: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou
- 2020 Poster: On Completeness-aware Concept-Based Explanations in Deep Neural Networks
  Chih-Kuan Yeh · Been Kim · Sercan Arik · Chun-Liang Li · Tomas Pfister · Pradeep Ravikumar
- 2020 Poster: MOPO: Model-based Offline Policy Optimization
  Tianhe Yu · Garrett Thomas · Lantao Yu · Stefano Ermon · James Zou · Sergey Levine · Chelsea Finn · Tengyu Ma
- 2019: Phenotype
  Nir HaCohen · David Reshef · Matthew Johnson · Sam Morris · Aurel Nagy · Gokcen Eraslan · Meromit Singer · Eliezer Van Allen · Smita Krishnaswamy · Casey Greene · Scott Linderman · Alexander Wiltschko · Dylan Kotliar · James Zou · Brendan Bulik-Sullivan
- 2019 Poster: Visualizing and Measuring the Geometry of BERT
  Emily Reif · Ann Yuan · Martin Wattenberg · Fernanda Viegas · Andy Coenen · Adam Pearce · Been Kim
- 2019 Poster: A Benchmark for Interpretability Methods in Deep Neural Networks
  Sara Hooker · Dumitru Erhan · Pieter-Jan Kindermans · Been Kim
- 2018: Interpretability for when NOT to use machine learning by Been Kim
  Been Kim
- 2018 Poster: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Spotlight: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Poster: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Spotlight: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Poster: To Trust Or Not To Trust A Classifier
  Heinrich Jiang · Been Kim · Melody Guan · Maya Gupta
- 2018 Poster: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders
  Abubakar Abid · James Zou
- 2017: Poster Spotlights I
  Taesik Na · Yang Song · Aman Sinha · Richard Shin · Qiuyuan Huang · Nina Narodytska · Matt Staib · Kexin Pei · Fnu Suya · Amirata Ghorbani · Jacob Buckman · Matthias Hein · Huan Zhang · Yanjun Qi · Yuan Tian · Min Du · Dimitris Tsipras