Knowing when a classifier's prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has been towards improving classifier performance, understanding when a classifier's predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier's discriminant or confidence score; however, we show there exists an alternative that is more effective in many situations. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the testing example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier's confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
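To make the abstract's idea concrete, here is a minimal sketch of a ratio-of-distances score in the spirit of the trust score: it compares each test point's distance to the training points of the predicted class against its distance to the nearest other class. This is an illustrative simplification, not the authors' implementation; it uses a plain 1-nearest-neighbor distance per class and omits the modifications (such as density-based filtering of the training data) that the paper's "modified nearest-neighbor classifier" refers to. The helper name `simple_trust_scores` is hypothetical.

```python
# Minimal sketch (assumption: plain per-class 1-NN distances; no density filtering).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def simple_trust_scores(X_train, y_train, X_test, y_pred):
    """Return d(x, nearest non-predicted class) / d(x, predicted class) per test point.

    Larger values suggest the prediction is more likely to be trustworthy.
    """
    classes = np.unique(y_train)
    # One 1-NN index per class, built over that class's training points.
    nn_by_class = {
        c: NearestNeighbors(n_neighbors=1).fit(X_train[y_train == c])
        for c in classes
    }
    # Distance from each test point to its nearest neighbor in every class.
    dists = np.stack(
        [nn_by_class[c].kneighbors(X_test)[0][:, 0] for c in classes], axis=1
    )
    pred_idx = np.searchsorted(classes, y_pred)
    rows = np.arange(len(X_test))
    d_pred = dists[rows, pred_idx]
    # Mask out the predicted class, then take the closest remaining class.
    other = dists.copy()
    other[rows, pred_idx] = np.inf
    d_other = other.min(axis=1)
    eps = 1e-12  # avoid division by zero when a test point coincides with training data
    return d_other / (d_pred + eps)
```

A high value means the predicted class's training points are much closer than any other class's, matching the intuition in the abstract; low values flag predictions that may deserve a second look.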
Author Information
Heinrich Jiang (Google Research)
Been Kim (Google)
Melody Guan (Stanford University)
Maya Gupta (Google)
More from the Same Authors
- 2021: An Empirical Study of Pre-trained Models on Out-of-distribution Generalization
  Yaodong Yu · Heinrich Jiang · Dara Bahri · Hossein Mobahi · Seungyeon Kim · Ankit Rawat · Andreas Veit · Yi Ma
- 2021: Interpretability of Machine Learning in Computer Systems: Analyzing a Caching Model
  Leon Sixt · Evan Liu · Marie Pellat · James Wexler · Milad Hashemi · Been Kim · Martin Maas
- 2020 Poster: Debugging Tests for Model Explanations
  Julius Adebayo · Michael Muelly · Ilaria Liccardi · Been Kim
- 2020 Poster: On Completeness-aware Concept-Based Explanations in Deep Neural Networks
  Chih-Kuan Yeh · Been Kim · Sercan Arik · Chun-Liang Li · Tomas Pfister · Pradeep Ravikumar
- 2020 Poster: Faster DBSCAN via subsampled similarity queries
  Heinrich Jiang · Jennifer Jang · Jakub Lacki
- 2019 Poster: Optimizing Generalized Rate Metrics with Three Players
  Harikrishna Narasimhan · Andrew Cotter · Maya Gupta
- 2019 Poster: Towards Automatic Concept-based Explanations
  Amirata Ghorbani · James Wexler · James Zou · Been Kim
- 2019 Oral: Optimizing Generalized Rate Metrics with Three Players
  Harikrishna Narasimhan · Andrew Cotter · Maya Gupta
- 2019 Poster: On Making Stochastic Classifiers Deterministic
  Andrew Cotter · Maya Gupta · Harikrishna Narasimhan
- 2019 Oral: On Making Stochastic Classifiers Deterministic
  Andrew Cotter · Maya Gupta · Harikrishna Narasimhan
- 2019 Poster: Visualizing and Measuring the Geometry of BERT
  Emily Reif · Ann Yuan · Martin Wattenberg · Fernanda Viegas · Andy Coenen · Adam Pearce · Been Kim
- 2019 Poster: A Benchmark for Interpretability Methods in Deep Neural Networks
  Sara Hooker · Dumitru Erhan · Pieter-Jan Kindermans · Been Kim
- 2018: Accepted papers
  Sven Gowal · Bogdan Kulynych · Marius Mosbach · Nicholas Frosst · Phil Roth · Utku Ozbulak · Simral Chaudhary · Toshiki Shibahara · Salome Viljoen · Nikita Samarin · Briland Hitaj · Rohan Taori · Emanuel Moss · Melody Guan · Lukas Schott · Angus Galloway · Anna Golubeva · Xiaomeng Jin · Felix Kreuk · Akshayvarun Subramanya · Vipin Pillai · Hamed Pirsiavash · Giuseppe Ateniese · Ankita Kalra · Logan Engstrom · Anish Athalye
- 2018: Interpretability for when NOT to use machine learning by Been Kim
  Been Kim
- 2018 Poster: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Spotlight: Human-in-the-Loop Interpretability Prior
  Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez
- 2018 Poster: Diminishing Returns Shape Constraints for Interpretability and Regularization
  Maya Gupta · Dara Bahri · Andrew Cotter · Kevin Canini
- 2018 Poster: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2018 Spotlight: Sanity Checks for Saliency Maps
  Julius Adebayo · Justin Gilmer · Michael Muelly · Ian Goodfellow · Moritz Hardt · Been Kim
- 2017 Poster: Deep Lattice Networks and Partial Monotonic Functions
  Seungil You · David Ding · Kevin Canini · Jan Pfeifer · Maya Gupta
- 2016 Poster: Launch and Iterate: Reducing Prediction Churn
  Mahdi Milani Fard · Quentin Cormier · Kevin Canini · Maya Gupta
- 2016 Poster: Fast and Flexible Monotonic Functions with Ensembles of Lattices
  Mahdi Milani Fard · Kevin Canini · Andrew Cotter · Jan Pfeifer · Maya Gupta
- 2016 Poster: Satisfying Real-world Goals with Dataset Constraints
  Gabriel Goh · Andrew Cotter · Maya Gupta · Michael P Friedlander