As machine learning has become increasingly ubiquitous, there has been a growing need to assess the trustworthiness of learned models. One important aspect of model trust is conceptual soundness, i.e., the extent to which a model uses features that are appropriate for its intended task. We present TruLens, a new cross-platform framework for explaining deep network behavior. In our demonstration, we provide an interactive application built on TruLens that we use to explore the conceptual soundness of various pre-trained models. Throughout the presentation, we take the unique perspective that robustness to small-norm adversarial examples is a necessary condition for conceptual soundness; we demonstrate this by comparing explanations on models trained with and without a robust objective. Our demonstration will focus on our end-to-end application, which will be made accessible for the audience to interact with, but we will also provide details on its open-source components, including the TruLens library and the code used to train robust networks.
Author Information
Anupam Datta (Carnegie Mellon University)
Matt Fredrikson (Carnegie Mellon University)
Klas Leino (Carnegie Mellon University)
I'm a researcher at CMU focused on studying the weaknesses and vulnerabilities of deep learning; I work to improve DNN security, transparency, and privacy.
Kaiji Lu (Carnegie Mellon University)
Shayak Sen (TruEra, Inc.)
Ricardo C Shih (TruEra)
Zifan Wang (Carnegie Mellon University)
More from the Same Authors
- 2023 Poster: Grounding Neural Inference with Satisfiability Modulo Theories
  Matt Fredrikson · Kaiji Lu · Somesh Jha · Saranya Vijayakumar · Vijay Ganesh · Zifan Wang
- 2023 Poster: Scaling in Depth: Unlocking Robustness Certification on ImageNet
  Kai Hu · Andy Zou · Zifan Wang · Klas Leino · Matt Fredrikson
- 2021 Poster: Influence Patterns for Explaining Information Flow in BERT
  Kaiji Lu · Zifan Wang · Piotr Mardziel · Anupam Datta
- 2021 Poster: Relaxing Local Robustness
  Klas Leino · Matt Fredrikson
- 2020 Poster: Smoothed Geometry for Robust Attribution
  Zifan Wang · Haofan Wang · Shakul Ramkumar · Piotr Mardziel · Matt Fredrikson · Anupam Datta
- 2018 Poster: Hunting for Discriminatory Proxies in Linear Regression Models
  Samuel Yeom · Anupam Datta · Matt Fredrikson