Timezone: »

Defend Against Textual Backdoor Attacks By Token Substitution
Xinglin Li · Yao Li · Minhao Cheng
Event URL: https://openreview.net/forum?id=irMklrzJDr7 »

Backdoor attacks are a type of malicious threat to deep neural networks (DNNs). The attacker injects a trigger into the model during the training process. The victim model behaves normally on data without the backdoor attack trigger but gives a prediction the same as the attacker-specified target. Backdoor attacks were first investigated in computer vision. The investigation of backdoor attacks has also emerged in natural language processing (NLP) recently. However, the study of defense methods against textual backdoor attacks is still insufficient. Especially, there are not enough methods available to protect against backdoor attacks using syntax as the trigger. In this paper, we propose a novel method that can effectively defend against syntactic backdoor attacks. Experiments show the effectiveness of our method on BERT for syntactic backdoor attacks when choosing five different syntaxes as triggers.

Author Information

Xinglin Li (University of North Carolina at Chapel Hill)
Yao Li (University of North Carolina at Chapel Hill)
Yao Li

I am an assistant professor of Statistics at UNC Chapel Hill. I was a Ph.D. student at UC Davis working with Prof. Cho-Jui Hsieh and Prof. Thomas C.M. Lee. I received my master degree in London School of Economics and Political Science under supervision of Prof. Piotr Fryzlewicz. My research focuses on developing new algorithms to resolve the real-world difficulties in the machine learning pipeline. I study both statistical and computational aspects of machine learning models. I am interested in developing new models with statistical guarantees, such as recommeder systems, factorial machine and fiducial inference. Currently, I am working on adversarial examples, trying to improve the robustness of deep neural networks.

Minhao Cheng (Hong Kong University of Science and Technology)

More from the Same Authors