Skip to yearly menu bar Skip to main content

Workshop: Backdoors in Deep Learning: The Good, the Bad, and the Ugly

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Haonan Wang · Qianli Shen · Yao Tong · Yang Zhang · Kenji Kawaguchi

[ ] [ Project Page ]
Fri 15 Dec 11:15 a.m. PST — 11:30 a.m. PST


Diffusion models (DM) have increasingly demonstrated an ability to generate high-quality images, often indistinguishable from real ones. However, their complexity and vast parameter space have introduced potential copyright concerns. While measures have been introduced to prevent unauthorized access to copyrighted material, the efficacy of these solutions remains unverified. In this study, we examine the vulnerabilities associated with copyright in DMs, concentrating on the influence of backdoor data poisoning attacks during further fine-tuning on public datasets. We introduce \textbf{\ourmethod}, an innovative method for embedding backdoor data poisonings tailored for DMs. This method allows the finetuned models to recreate copyrighted images in response to particular trigger prompts by embedding components of copyrighted images across various images inconspicuously. In the inference process, DMs utilize their understanding of these prompts to regenerate the copyrighted images. Our empirical results indicate that the information of copyrighted data can be stealthily encoded into training data using \ourmethod, causing the fine-tuned DM to generate infringing content. These findings underline potential pitfalls in the prevailing copyright protection strategies and underscore the necessity for increased scrutiny and preventative measures against misuse of DMs.

Chat is not available.