Poster
Autoregressive Perturbations for Data Poisoning
Pedro Sandoval-Segura · Vasu Singla · Jonas Geiping · Micah Goldblum · Tom Goldstein · David Jacobs

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #408

The prevalence of data scraping from social media as a means to obtain datasets has led to growing concerns regarding unauthorized use of data. Data poisoning attacks have been proposed as a bulwark against scraping, as they make data "unlearnable" by adding small, imperceptible perturbations. Unfortunately, existing methods require knowledge of both the target architecture and the complete dataset so that a surrogate network can be trained, the parameters of which are used to generate the attack. In this work, we introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset. The proposed AR perturbations are generic, can be applied across different datasets, and can poison different architectures. Compared to existing unlearnable methods, our AR poisons are more resistant against common defenses such as adversarial training and strong data augmentations. Our analysis further provides insight into what makes an effective data poison.
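The abstract does not spell out how an AR perturbation is built, so the following is only a rough illustration of the general idea, not the authors' procedure. It assumes a simple 2D autoregressive process in which each interior pixel of the perturbation is a fixed linear combination of three already-generated neighbors, followed by rescaling to a small L2 budget. The names `ar_perturbation` and `poison`, the coefficient values, and the budget `eps` are all hypothetical. Note that nothing here looks at a dataset or a surrogate network, which is the property the abstract emphasizes.

```python
import random


def ar_perturbation(height, width, coeffs, eps, seed=0):
    """Sketch of a 2D autoregressive perturbation (illustrative assumption).

    coeffs = (a_left, a_up, a_diag) are hypothetical AR coefficients.
    The first row and column are seeded with Gaussian noise; every other
    entry is a linear combination of its left, upper, and upper-left
    neighbors. The result is rescaled so its L2 norm equals eps.
    """
    rng = random.Random(seed)
    a_left, a_up, a_diag = coeffs
    delta = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            if i == 0 or j == 0:
                # Boundary entries start the process with random noise.
                delta[i][j] = rng.gauss(0.0, 1.0)
            else:
                # Interior entries depend only on previously generated ones.
                delta[i][j] = (a_left * delta[i][j - 1]
                               + a_up * delta[i - 1][j]
                               + a_diag * delta[i - 1][j - 1])
    norm = sum(v * v for row in delta for v in row) ** 0.5
    scale = eps / norm if norm > 0 else 0.0
    return [[v * scale for v in row] for row in delta]


def poison(image, delta):
    """Add the perturbation to a [0, 1] grayscale image and clip."""
    return [[min(1.0, max(0.0, p + d)) for p, d in zip(img_row, d_row)]
            for img_row, d_row in zip(image, delta)]


# Usage: perturb a flat gray 8x8 image with a small-norm AR pattern.
image = [[0.5] * 8 for _ in range(8)]
delta = ar_perturbation(8, 8, coeffs=(0.5, 0.3, -0.2), eps=1.0)
poisoned = poison(image, delta)
```

Because the recurrence is linear, rescaling the whole pattern keeps the AR relation between neighboring pixels intact. In a classification setting one might pick a different coefficient set per class, but that choice is an assumption of this sketch rather than something stated in the abstract.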

#### Author Information

##### Pedro Sandoval-Segura (University of Maryland, College Park)

I am currently a PhD student at the University of Maryland at College Park, where I am advised by Prof. David Jacobs and Prof. Tom Goldstein. I am broadly interested in computer vision and deep learning research. Lately, my research focuses on adversarial examples, adversarial training, and data poisoning.

##### Vasu Singla (University of Maryland)

I am a third-year graduate student at the University of Maryland, interested in adversarial robustness.

##### Jonas Geiping (University of Maryland, College Park)

Jonas is a postdoctoral researcher at UMD. His background is in Mathematics, more specifically in mathematical optimization and its applications to deep learning. His current focus is on designing more secure and private ML systems, especially for federated learning, and on understanding fundamental phenomena behind generalization.