Modern neural language models, which are widely used across NLP tasks, risk memorizing sensitive information from their training data. Understanding this memorization is important both for real-world applications and from a learning-theoretic perspective. An open question in previous studies of language-model memorization is how to filter out "common" memorization: most memorization criteria correlate strongly with the number of occurrences in the training set, and therefore capture memorized familiar phrases, public knowledge, templated texts, and other repeated data. We formulate a notion of counterfactual memorization, which characterizes how a model's predictions change if a particular document is omitted during training. We identify and study counterfactually memorized training examples in standard text datasets. We estimate the influence of each memorized training example on the validation set and on generated texts, showing how this can provide direct evidence about the source of memorization at test time.
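Concretely, counterfactual memorization can be estimated by training many models on random subsets of the data and comparing a model's performance on an example depending on whether that example was in its training subset, in the spirit of the influence-estimation approach of Feldman & Zhang (2020) listed below. The following is a minimal sketch of such an estimator, assuming a precomputed matrix of per-example scores; the names (counterfactual_memorization, scores, in_subset) are illustrative, not the authors' released code.

import numpy as np

def counterfactual_memorization(scores, in_subset):
    """Estimate counterfactual memorization for each training example.

    scores:    (num_models, num_examples) array; scores[m, i] is model m's
               performance on training example i (e.g. per-token accuracy).
    in_subset: (num_models, num_examples) boolean array; True iff example i
               was in the random training subset of model m.

    Returns mem[i] = E[score_i | i in training set] - E[score_i | i held out].
    """
    in_mask = in_subset
    out_mask = ~in_subset
    # Average performance over models that did / did not train on each example;
    # np.maximum guards against dividing by zero when a column is all in or all out.
    mean_in = (scores * in_mask).sum(axis=0) / np.maximum(in_mask.sum(axis=0), 1)
    mean_out = (scores * out_mask).sum(axis=0) / np.maximum(out_mask.sum(axis=0), 1)
    return mean_in - mean_out

# Toy usage: 40 models, each trained on a random half of 1000 examples.
rng = np.random.default_rng(0)
num_models, num_examples = 40, 1000
in_subset = rng.random((num_models, num_examples)) < 0.5
scores = rng.random((num_models, num_examples))  # stand-in for real evaluation scores
mem = counterfactual_memorization(scores, in_subset)
print("most counterfactually memorized example:", int(mem.argmax()))

Applying the same in/out averaging to a held-out or generated example's score, rather than to the training example's own score, gives an estimate of that training example's influence on the test-time output, which is roughly how the abstract frames tracing memorization back to its source.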
Author Information
Chiyuan Zhang (Google Research)
Daphne Ippolito (School of Engineering and Applied Science, University of Pennsylvania)
Katherine Lee (Cornell University)
Matthew Jagielski (Google DeepMind)
Florian Tramer (ETH Zurich)
Nicholas Carlini (Google)
More from the Same Authors
- 2021 : Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing »
  Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto
- 2021 : Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation »
  Yao Qin · Chiyuan Zhang · Ting Chen · Balaji Lakshminarayanan · Alex Beutel · Xuezhi Wang
- 2023 : Is My Prediction Arbitrary? Confounding Effects of Variance in Fair Classification »
  A. Feder Cooper · Katherine Lee · Madiha Choksi · Solon Barocas · Christopher De Sa · James Grimmelmann · Jon Kleinberg · Siddhartha Sen · Baobao Zhang
- 2023 : Evaluating Superhuman Models with Consistency Checks »
  Lukas Fluri · Daniel Paleka · Florian Tramer
- 2023 Workshop: Socially Responsible Language Modelling Research (SoLaR) »
  Usman Anwar · David Krueger · Samuel Bowman · Jakob Foerster · Su Lin Blodgett · Roberta Raileanu · Alan Chan · Katherine Lee · Laura Ruis · Robert Kirk · Yawen Duan · Xin Chen · Kawin Ethayarajh
- 2023 Poster: Sparsity-Preserving Differentially Private Training of Large Embedding Models »
  Badih Ghazi · Yangsibo Huang · Pritish Kamath · Ravi Kumar · Pasin Manurangsi · Amer Sinha · Chiyuan Zhang
- 2023 Poster: Students Parrot Their Teachers: Membership Inference on Model Distillation »
  Matthew Jagielski · Milad Nasr · Katherine Lee · Christopher A. Choquette-Choo · Nicholas Carlini · Florian Tramer
- 2023 Oral: Students Parrot Their Teachers: Membership Inference on Model Distillation »
  Matthew Jagielski · Milad Nasr · Katherine Lee · Christopher A. Choquette-Choo · Nicholas Carlini · Florian Tramer
- 2023 Poster: Optimal Unbiased Randomizers for Regression with Label Differential Privacy »
  Ashwinkumar Badanidiyuru Varadaraja · Badih Ghazi · Pritish Kamath · Ravi Kumar · Ethan Leeman · Pasin Manurangsi · Avinash V Varadarajan · Chiyuan Zhang
- 2023 Poster: Effective Robustness against Natural Distribution Shifts for Models with Different Training Data »
  Zhouxing Shi · Nicholas Carlini · Ananth Balashankar · Ludwig Schmidt · Cho-Jui Hsieh · Alex Beutel · Yao Qin
- 2023 Poster: Are aligned neural networks adversarially aligned? »
  Nicholas Carlini · Milad Nasr · Christopher A. Choquette-Choo · Matthew Jagielski · Irena Gao · Pang Wei Koh · Daphne Ippolito · Florian Tramer · Ludwig Schmidt
- 2023 Poster: Privacy Auditing with One (1) Training Run »
  Thomas Steinke · Milad Nasr · Matthew Jagielski
- 2023 Poster: User-Level Differential Privacy With Few Examples Per User »
  Badih Ghazi · Pritish Kamath · Ravi Kumar · Pasin Manurangsi · Raghu Meka · Chiyuan Zhang
- 2023 Oral: User-Level Differential Privacy With Few Examples Per User »
  Badih Ghazi · Pritish Kamath · Ravi Kumar · Pasin Manurangsi · Raghu Meka · Chiyuan Zhang
- 2023 Oral: Privacy Auditing with One (1) Training Run »
  Thomas Steinke · Milad Nasr · Matthew Jagielski
- 2022 Workshop: Workshop on Machine Learning Safety »
  Dan Hendrycks · Victoria Krakovna · Dawn Song · Jacob Steinhardt · Nicholas Carlini
- 2022 Poster: Handcrafted Backdoors in Deep Neural Networks »
  Sanghyun Hong · Nicholas Carlini · Alexey Kurakin
- 2022 Poster: Increasing Confidence in Adversarial Robustness Evaluations »
  Roland S. Zimmermann · Wieland Brendel · Florian Tramer · Nicholas Carlini
- 2022 Poster: Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation »
  Yao Qin · Chiyuan Zhang · Ting Chen · Balaji Lakshminarayanan · Alex Beutel · Xuezhi Wang
- 2022 Poster: The Privacy Onion Effect: Memorization is Relative »
  Nicholas Carlini · Matthew Jagielski · Chiyuan Zhang · Nicolas Papernot · Andreas Terzis · Florian Tramer
- 2022 Poster: Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples »
  Maura Pintor · Luca Demetrio · Angelo Sotgiu · Ambra Demontis · Nicholas Carlini · Battista Biggio · Fabio Roli
- 2022 Poster: Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures »
  Emmanuel Abbe · Samy Bengio · Elisabetta Cornacchia · Jon Kleinberg · Aryo Lotfi · Maithra Raghu · Chiyuan Zhang
- 2021 Poster: Antipodes of Label Differential Privacy: PATE and ALIBI »
  Mani Malek Esmaeili · Ilya Mironov · Karthik Prasad · Igor Shilov · Florian Tramer
- 2021 Poster: Deep Learning with Label Differential Privacy »
  Badih Ghazi · Noah Golowich · Ravi Kumar · Pasin Manurangsi · Chiyuan Zhang
- 2021 Poster: Do Vision Transformers See Like Convolutional Neural Networks? »
  Maithra Raghu · Thomas Unterthiner · Simon Kornblith · Chiyuan Zhang · Alexey Dosovitskiy
- 2020 Poster: What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation »
  Vitaly Feldman · Chiyuan Zhang
- 2020 Spotlight: What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation »
  Vitaly Feldman · Chiyuan Zhang
- 2020 Poster: On Adaptive Attacks to Adversarial Example Defenses »
  Florian Tramer · Nicholas Carlini · Wieland Brendel · Aleksander Madry
- 2020 Poster: Measuring Robustness to Natural Distribution Shifts in Image Classification »
  Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt
- 2020 Poster: What is being transferred in transfer learning? »
  Behnam Neyshabur · Hanie Sedghi · Chiyuan Zhang
- 2020 Spotlight: Measuring Robustness to Natural Distribution Shifts in Image Classification »
  Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt
- 2019 Poster: Transfusion: Understanding Transfer Learning for Medical Imaging »
  Maithra Raghu · Chiyuan Zhang · Jon Kleinberg · Samy Bengio
- 2019 Poster: Adversarial Training and Robustness for Multiple Perturbations »
  Florian Tramer · Dan Boneh
- 2019 Spotlight: Adversarial Training and Robustness for Multiple Perturbations »
  Florian Tramer · Dan Boneh
- 2018 : Contributed talk 6: Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware »
  Florian Tramer
- 2018 Workshop: Workshop on Security in Machine Learning »
  Nicolas Papernot · Jacob Steinhardt · Matt Fredrikson · Kamalika Chaudhuri · Florian Tramer