Timezone: »
Hundreds of defenses have been proposed to make deep neural networks robust against minimal (adversarial) input perturbations. However, only a handful of these defenses held up their claims because correctly evaluating robustness is extremely challenging: Weak attacks often fail to find adversarial examples even if they unknowingly exist, thereby making a vulnerable network look robust. In this paper, we propose a test to identify weak attacks and, thus, weak defense evaluations. Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample. Consequentially, any correct attack must succeed in breaking this modified network. For eleven out of thirteen previously-published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it. We hope that attack unit tests - such as ours - will be a major component in future robustness evaluations and increase confidence in an empirical field that is currently riddled with skepticism.
Author Information
Roland S. Zimmermann (University of Tübingen, International Max Planck Research School for Intelligent Systems)
Wieland Brendel (AG Bethge, University of Tübingen)
Florian Tramer (ETH Zurich)
Nicholas Carlini (Google)
More from the Same Authors
-
2021 Spotlight: How Well do Feature Visualizations Support Causal Understanding of CNN Activations? »
Roland S. Zimmermann · Judy Borowski · Robert Geirhos · Matthias Bethge · Thomas Wallis · Wieland Brendel -
2021 : Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing »
Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto -
2021 : Score-Based Generative Classifiers »
Roland S. Zimmermann · Lukas Schott · Yang Song · Benjamin Dunn · David Klindt -
2021 : Score-Based Generative Classifiers »
Roland S. Zimmermann · Lukas Schott · Yang Song · Benjamin Dunn · David Klindt -
2022 Workshop: Workshop on Machine Learning Safety »
Dan Hendrycks · Victoria Krakovna · Dawn Song · Jacob Steinhardt · Nicholas Carlini -
2022 Spotlight: Embrace the Gap: VAEs Perform Independent Mechanism Analysis »
Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve -
2022 Poster: Handcrafted Backdoors in Deep Neural Networks »
Sanghyun Hong · Nicholas Carlini · Alexey Kurakin -
2022 Poster: Embrace the Gap: VAEs Perform Independent Mechanism Analysis »
Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve -
2022 Poster: The Privacy Onion Effect: Memorization is Relative »
Nicholas Carlini · Matthew Jagielski · Chiyuan Zhang · Nicolas Papernot · Andreas Terzis · Florian Tramer -
2022 Poster: Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples »
Maura Pintor · Luca Demetrio · Angelo Sotgiu · Ambra Demontis · Nicholas Carlini · Battista Biggio · Fabio Roli -
2021 : Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing »
Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto -
2021 Poster: Antipodes of Label Differential Privacy: PATE and ALIBI »
Mani Malek Esmaeili · Ilya Mironov · Karthik Prasad · Igor Shilov · Florian Tramer -
2021 Poster: How Well do Feature Visualizations Support Causal Understanding of CNN Activations? »
Roland S. Zimmermann · Judy Borowski · Robert Geirhos · Matthias Bethge · Thomas Wallis · Wieland Brendel -
2021 Oral: Partial success in closing the gap between human and machine vision »
Robert Geirhos · Kantharaju Narayanappa · Benjamin Mitzkus · Tizian Thieringer · Matthias Bethge · Felix A. Wichmann · Wieland Brendel -
2021 Poster: Partial success in closing the gap between human and machine vision »
Robert Geirhos · Kantharaju Narayanappa · Benjamin Mitzkus · Tizian Thieringer · Matthias Bethge · Felix A. Wichmann · Wieland Brendel -
2021 Poster: Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style »
Julius von Kügelgen · Yash Sharma · Luigi Gresele · Wieland Brendel · Bernhard Schölkopf · Michel Besserve · Francesco Locatello -
2021 Poster: Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints »
Maura Pintor · Fabio Roli · Wieland Brendel · Battista Biggio -
2020 Poster: On Adaptive Attacks to Adversarial Example Defenses »
Florian Tramer · Nicholas Carlini · Wieland Brendel · Aleksander Madry -
2020 Poster: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2020 Spotlight: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2020 Poster: Improving robustness against common corruptions by covariate shift adaptation »
Steffen Schneider · Evgenia Rusak · Luisa Eck · Oliver Bringmann · Wieland Brendel · Matthias Bethge -
2019 Poster: Learning from brains how to regularize machines »
Zhe Li · Wieland Brendel · Edgar Walker · Erick Cobos · Taliah Muhammad · Jacob Reimer · Matthias Bethge · Fabian Sinz · Xaq Pitkow · Andreas Tolias -
2019 Poster: Adversarial Training and Robustness for Multiple Perturbations »
Florian Tramer · Dan Boneh -
2019 Spotlight: Adversarial Training and Robustness for Multiple Perturbations »
Florian Tramer · Dan Boneh -
2019 Poster: Accurate, reliable and fast robustness evaluation »
Wieland Brendel · Jonas Rauber · Matthias Kümmerer · Ivan Ustyuzhaninov · Matthias Bethge -
2018 : Contributed talk 6: Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware »
Florian Tramer -
2018 : Adversarial Vision Challenge: Results of the Adversarial Vision Challenge »
Wieland Brendel · Jonas Rauber · Marcel Salathé · Alexey Kurakin · Nicolas Papernot · Sharada Mohanty · Matthias Bethge -
2018 Workshop: Workshop on Security in Machine Learning »
Nicolas Papernot · Jacob Steinhardt · Matt Fredrikson · Kamalika Chaudhuri · Florian Tramer