
Oral Poster

Students Parrot Their Teachers: Membership Inference on Model Distillation

Matthew Jagielski · Milad Nasr · Katherine Lee · Christopher A. Choquette-Choo · Nicholas Carlini · Florian Tramer

Great Hall & Hall B1+B2 (level 1) #1610
Thu 14 Dec 8:45 a.m. PST — 10:45 a.m. PST
Oral presentation: Oral 5B Privacy/Fairness
Thu 14 Dec 8 a.m. PST — 8:45 a.m. PST


Model distillation is frequently proposed as a technique to reduce the privacy leakage of machine learning. These empirical privacy defenses rely on the intuition that distilled "student" models protect the privacy of training data, as they only interact with this data indirectly through a "teacher" model. In this work, we design membership inference attacks to systematically study the privacy provided by knowledge distillation to both the teacher and student training sets. Our new attacks show that distillation alone provides only limited privacy across a number of domains. We explain the success of our attacks on distillation by showing that membership inference attacks on a private dataset can succeed even if the target model is never queried on any actual training points, but only on inputs whose predictions are highly influenced by training data. Finally, we show that our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
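To illustrate the membership inference setting the abstract refers to, here is a minimal toy sketch (not the paper's actual attack, which is more sophisticated): a 1-nearest-neighbor "teacher" memorizes a private training set with random labels, and a simple loss-threshold attack distinguishes members from non-members because memorized points have zero loss. All names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a 1-NN "teacher" that memorizes its private
# training set; labels are random, so correct predictions on members
# can only come from memorization, not generalization.
X_member = rng.uniform(0, 1, size=(100, 2))
y_member = rng.integers(0, 2, size=100)
X_nonmem = rng.uniform(0, 1, size=(100, 2))
y_nonmem = rng.integers(0, 2, size=100)

def teacher_predict(x):
    """1-NN over the private set: each training point retrieves itself."""
    i = np.argmin(np.linalg.norm(X_member - x, axis=1))
    return y_member[i]

def zero_one_loss(X, y):
    """0/1 loss of the teacher on a batch of (point, label) pairs."""
    return np.array([teacher_predict(x) != yi for x, yi in zip(X, y)], float)

member_loss = zero_one_loss(X_member, y_member)  # all zeros (memorized)
nonmem_loss = zero_one_loss(X_nonmem, y_nonmem)  # ~0.5 (random labels)

# Loss-threshold attack: guess "member" when the loss is low.
tpr = np.mean(member_loss < 0.5)   # true positive rate on members
fpr = np.mean(nonmem_loss < 0.5)   # false positive rate on non-members
advantage = tpr - fpr              # attack advantage over random guessing
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}  advantage={advantage:.2f}")
```

A positive advantage shows the attacker separates members from non-members better than chance. The paper's point is that this kind of leakage can survive distillation: an attacker can succeed even without querying the deployed model on the training points themselves, using nearby inputs whose predictions the training data strongly influences.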
