A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks. These attacks are particularly dangerous because they do not affect a network's behavior on typical, benign data. Rather, the network only deviates from its expected output when triggered by an adversary's planted perturbation.
In this paper, we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image datasets and state-of-the-art neural network architectures. We believe that understanding spectral signatures is a crucial first step towards a principled understanding of backdoor attacks.
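To make the notion of a spectral signature more concrete, the sketch below shows one way such an outlier score can be computed from the learned representations of a single class: center the feature matrix, take its top singular direction, and score each example by its squared projection onto that direction, then remove the highest-scoring examples. This is a minimal NumPy illustration on synthetic features; the feature extractor, the size of the synthetic shift, and the fraction of examples removed are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

def spectral_signature_scores(features: np.ndarray) -> np.ndarray:
    """Outlier score for each example of one class: squared projection of its
    centered representation onto the top singular direction of the feature matrix."""
    # features: (n_examples, feature_dim) learned representations for a single label
    centered = features - features.mean(axis=0, keepdims=True)
    # Top right singular vector of the centered feature matrix
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_direction = vt[0]
    return (centered @ top_direction) ** 2

if __name__ == "__main__":
    # Synthetic stand-in for learned representations: clean examples plus a
    # small poisoned cluster shifted along a fixed direction (an assumption).
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(500, 128))
    poisoned = rng.normal(size=(25, 128)) + 3.0 * rng.normal(size=(1, 128))
    feats = np.vstack([clean, poisoned])

    scores = spectral_signature_scores(feats)
    # Flag a small fraction of the highest-scoring examples as suspected poison.
    n_remove = int(0.06 * len(feats))
    suspected = np.argsort(scores)[-n_remove:]
    print("fraction of flagged examples that are poisoned:",
          np.mean(suspected >= len(clean)))
```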
Author Information
Brandon Tran (Massachusetts Institute of Technology)
Jerry Li (Berkeley)
Aleksander Madry (MIT)
Aleksander Madry is the NBX Associate Professor of Computer Science in the MIT EECS Department and a principal investigator in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). He received his PhD from MIT in 2011 and, prior to joining the MIT faculty, he spent time at Microsoft Research New England and on the faculty of EPFL. Aleksander's research interests span algorithms, continuous optimization, the science of deep learning, and understanding machine learning from a robustness perspective. His work has been recognized with a number of awards, including an NSF CAREER Award, an Alfred P. Sloan Research Fellowship, an ACM Doctoral Dissertation Award Honorable Mention, and the 2018 Presburger Award.
More from the Same Authors
- 2022 : A Unified Framework for Comparing Learning Algorithms »
  Harshay Shah · Sung Min Park · Andrew Ilyas · Aleksander Madry
- 2022 : Invited Talk: Aleksander Mądry »
  Aleksander Madry
- 2022 Poster: 3DB: A Framework for Debugging Computer Vision Models »
  Guillaume Leclerc · Hadi Salman · Andrew Ilyas · Sai Vemprala · Logan Engstrom · Vibhav Vineet · Kai Xiao · Pengchuan Zhang · Shibani Santurkar · Greg Yang · Ashish Kapoor · Aleksander Madry
- 2021 : Discussion: Aleksander Mądry, Ernest Mwebaze, Suchi Saria »
  Aleksander Madry · Ernest Mwebaze · Suchi Saria
- 2021 : ML Model Debugging: A Data Perspective »
  Aleksander Madry
- 2021 Poster: Unadversarial Examples: Designing Objects for Robust Vision »
  Hadi Salman · Andrew Ilyas · Logan Engstrom · Sai Vemprala · Aleksander Madry · Ashish Kapoor
- 2021 Poster: Editing a classifier by rewriting its prediction rules »
  Shibani Santurkar · Dimitris Tsipras · Mahalaxmi Elango · David Bau · Antonio Torralba · Aleksander Madry
- 2020 : What Do Our Models Learn? »
  Aleksander Madry
- 2020 Poster: On Adaptive Attacks to Adversarial Example Defenses »
  Florian Tramer · Nicholas Carlini · Wieland Brendel · Aleksander Madry
- 2020 Poster: Do Adversarially Robust ImageNet Models Transfer Better? »
  Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry
- 2020 Oral: Do Adversarially Robust ImageNet Models Transfer Better? »
  Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry
- 2019 Workshop: Machine Learning with Guarantees »
  Ben London · Gintare Karolina Dziugaite · Daniel Roy · Thorsten Joachims · Aleksander Madry · John Shawe-Taylor
- 2019 Poster: Image Synthesis with a Single (Robust) Classifier »
  Shibani Santurkar · Andrew Ilyas · Dimitris Tsipras · Logan Engstrom · Brandon Tran · Aleksander Madry
- 2019 Poster: Adversarial Examples Are Not Bugs, They Are Features »
  Andrew Ilyas · Shibani Santurkar · Dimitris Tsipras · Logan Engstrom · Brandon Tran · Aleksander Madry
- 2019 Spotlight: Adversarial Examples Are Not Bugs, They Are Features »
  Andrew Ilyas · Shibani Santurkar · Dimitris Tsipras · Logan Engstrom · Brandon Tran · Aleksander Madry
- 2018 : Adversarial Vision Challenge: Shooting ML Models in the Dark: The Landscape of Blackbox Attacks »
  Aleksander Madry
- 2018 Poster: Byzantine Stochastic Gradient Descent »
  Dan Alistarh · Zeyuan Allen-Zhu · Jerry Li
- 2018 Poster: How Does Batch Normalization Help Optimization? »
  Shibani Santurkar · Dimitris Tsipras · Andrew Ilyas · Aleksander Madry
- 2018 Poster: Adversarially Robust Generalization Requires More Data »
  Ludwig Schmidt · Shibani Santurkar · Dimitris Tsipras · Kunal Talwar · Aleksander Madry
- 2018 Oral: How Does Batch Normalization Help Optimization? »
  Shibani Santurkar · Dimitris Tsipras · Andrew Ilyas · Aleksander Madry
- 2018 Spotlight: Adversarially Robust Generalization Requires More Data »
  Ludwig Schmidt · Shibani Santurkar · Dimitris Tsipras · Kunal Talwar · Aleksander Madry
- 2018 Tutorial: Adversarial Robustness: Theory and Practice »
  J. Zico Kolter · Aleksander Madry
- 2017 Poster: QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding »
  Dan Alistarh · Demjan Grubic · Jerry Li · Ryota Tomioka · Milan Vojnovic
- 2017 Poster: Communication-Efficient Distributed Learning of Discrete Distributions »
  Ilias Diakonikolas · Elena Grigorescu · Jerry Li · Abhiram Natarajan · Krzysztof Onak · Ludwig Schmidt
- 2017 Oral: Communication-Efficient Distributed Learning of Discrete Distributions »
  Ilias Diakonikolas · Elena Grigorescu · Jerry Li · Abhiram Natarajan · Krzysztof Onak · Ludwig Schmidt
- 2017 Spotlight: Communication-Efficient Stochastic Gradient Descent, with Applications to Neural Networks »
  Dan Alistarh · Demjan Grubic · Jerry Li · Ryota Tomioka · Milan Vojnovic