Protecting the privacy of user data is crucial for text generation models, which can leak sensitive information during generation. Differentially private (DP) learning methods provide guarantees against identifying the existence of a training sample from model outputs. PATE is a recent DP learning algorithm that achieves high utility with strong privacy protection on training samples. However, text generation models output tokens sequentially in a large output space; the classic PATE algorithm is not customized for this setting. Furthermore, PATE works well to protect sample-level privacy, but is not designed to protect phrases in samples. In this paper, we propose SeqPATE, an extension of PATE to text generation that protects the privacy of individual training samples and sensitive phrases in training data. To adapt PATE to text generation, we generate pseudo-contexts and reduce the sequence generation problem to a next-word prediction problem. To handle the large output space, we propose a candidate filtering strategy to dynamically reduce the output space, and refine the teacher aggregation of PATE to avoid low agreement due to voting for a large number of candidates. To further reduce privacy losses, we use knowledge distillation to reduce the number of teacher queries. The experiments verify the effectiveness of SeqPATE in protecting both training samples and sensitive phrases.
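The abstract's core recipe (reduce generation to next-word prediction over pseudo-contexts, filter the candidate set, then aggregate teacher votes with noise before releasing a token to the student) can be illustrated with a short sketch. The sketch below is an assumption-laden illustration, not the paper's exact algorithm: the function name, the top-k filter on the averaged teacher distribution, and the Laplace noisy-max step are placeholders for the mechanisms the abstract names.

    import numpy as np

    # Illustrative PATE-style aggregation for one next-word prediction step.
    # teacher_probs: (num_teachers, vocab_size) array of per-teacher
    # next-word distributions for the same pseudo-context.
    def aggregate_next_word(teacher_probs, top_k=50, noise_scale=1.0, rng=None):
        rng = rng if rng is not None else np.random.default_rng()

        # Candidate filtering (assumed form): keep only the top-k tokens of the
        # averaged teacher distribution, shrinking the large output space.
        mean_probs = teacher_probs.mean(axis=0)
        candidates = np.argsort(mean_probs)[-top_k:]

        # Each teacher votes for its most likely token among the candidates,
        # so votes concentrate instead of spreading over the full vocabulary.
        votes = np.zeros(len(candidates))
        for t in range(teacher_probs.shape[0]):
            votes[np.argmax(teacher_probs[t, candidates])] += 1.0

        # Noisy-max release: perturb the vote counts (Laplace noise here is an
        # assumption) before handing the winning token to the student.
        noisy_votes = votes + rng.laplace(scale=noise_scale, size=len(candidates))
        return int(candidates[np.argmax(noisy_votes)])

In this reading, a student generator would be distilled on (pseudo-context, aggregated next word) pairs, querying the teachers only when needed so that the privacy budget spent on aggregation stays small.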
Author Information
Zhiliang Tian (National University of Defense Technology)
Yingxiu Zhao (The Hong Kong University of Science and Technology)
Ziyue Huang (Hong Kong University of Science and Technology)
Yu-Xiang Wang (UC Santa Barbara)
Nevin L. Zhang (HKUST)
He He (NYU)
More from the Same Authors
- 2021 : Instance-dependent Offline Reinforcement Learning: From tabular RL to linear MDPs
  Ming Yin · Yu-Xiang Wang
- 2022 : Generalized PTR: User-Friendly Recipes for Data-Adaptive Algorithms with Differential Privacy
  Rachel Redberg · Yuqing Zhu · Yu-Xiang Wang
- 2022 : Voting-based Approaches for Differentially Private Federated Learning
  Yuqing Zhu · Xiang Yu · Yi-Hsuan Tsai · Francesco Pittaluga · Masoud Faraki · Manmohan Chandraker · Yu-Xiang Wang
- 2022 : Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
  Jiachen Li · Edwin Zhang · Ming Yin · Qinxun Bai · Yu-Xiang Wang · William Yang Wang
- 2022 : Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data
  Sunil Madhow · Dan Qiao · Yu-Xiang Wang
- 2022 : Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation
  Dan Qiao · Yu-Xiang Wang
- 2022 : Differentially Private Gradient Boosting on Linear Learners for Tabular Data
  Saeyoung Rho · Shuai Tang · Sergul Aydore · Michael Kearns · Aaron Roth · Yu-Xiang Wang · Steven Wu · Cedric Archambeau
- 2022 : Differentially Private Bias-Term only Fine-tuning of Foundation Models
  Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis
- 2023 Poster: Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger
  Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis
- 2023 Poster: Offline Reinforcement Learning with Differential Privacy
  Dan Qiao · Yu-Xiang Wang
- 2023 Poster: Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
  Lijing Kuang · Ming Yin · Mengdi Wang · Yu-Xiang Wang · Yian Ma
- 2023 Poster: Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples
  Abulhair Saparov · Yuanzhe Pang · Vishakh Padmakumar · Nitish Joshi · Seyed Mehran Kazemi · Najoung Kim · He He
- 2023 Poster: Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms
  Dheeraj Baby · Saurabh Garg · Tzu-Ching Yen · Sivaraman Balakrishnan · Zachary Lipton · Yu-Xiang Wang
- 2023 Poster: Improving the Privacy and Practicality of Objective Perturbation for Differentially Private Linear Learners
  Rachel Redberg · Antti Koskela · Yu-Xiang Wang
- 2023 Poster: A Privacy-Friendly Approach to Data Valuation
  Jiachen T. Wang · Yuqing Zhu · Yu-Xiang Wang · Ruoxi Jia · Prateek Mittal
- 2022 : Contributed Talk: Differentially Private Bias-Term only Fine-tuning of Foundation Models
  Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis
- 2022 : Panel on Privacy and Security in Machine Learning Systems
  Graham Cormode · Borja Balle · Yu-Xiang Wang · Alejandro Saucedo · Neil Lawrence
- 2022 : Practical differential privacy
  Yu-Xiang Wang · Fariba Yousefi
- 2022 : Practical differential privacy
  Yu-Xiang Wang
- 2022 Poster: Differentially Private Linear Sketches: Efficient Implementations and Applications
  Fuheng Zhao · Dan Qiao · Rachel Redberg · Divyakant Agrawal · Amr El Abbadi · Yu-Xiang Wang
- 2022 Poster: Optimal Dynamic Regret in LQR Control
  Dheeraj Baby · Yu-Xiang Wang
- 2021 Workshop: Privacy in Machine Learning (PriML) 2021
  Yu-Xiang Wang · Borja Balle · Giovanni Cherubin · Kamalika Chaudhuri · Antti Honkela · Jonathan Lebensold · Casey Meehan · Mi Jung Park · Adrian Weller · Yuqing Zhu
- 2021 Poster: IRM—when it works and when it doesn't: A test case of natural language inference
  Yana Dranker · He He · Yonatan Belinkov
- 2021 Poster: Instance-optimal Mean Estimation Under Differential Privacy
  Ziyue Huang · Yuting Liang · Ke Yi
- 2020 Workshop: Privacy Preserving Machine Learning - PriML and PPML Joint Edition
  Borja Balle · James Bell · Aurélien Bellet · Kamalika Chaudhuri · Adria Gascon · Antti Honkela · Antti Koskela · Casey Meehan · Olga Ohrimenko · Mi Jung Park · Mariana Raykova · Mary Anne Smart · Yu-Xiang Wang · Adrian Weller
- 2019 Poster: Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation
  Zengfeng Huang · Ziyue Huang · Yilei Wang · Ke Yi
- 2017 : Competition V: Human-Computer Question Answering
  Jordan Boyd-Graber · Hal Daumé III · He He · Mohit Iyyer · Pedro Rodriguez
- 2016 Poster: A Credit Assignment Compiler for Joint Prediction
  Kai-Wei Chang · He He · Stephane Ross · Hal Daumé III · John Langford
- 2014 Poster: Learning to Search in Branch and Bound Algorithms
  He He · Hal Daumé III · Jason Eisner
- 2012 Poster: Imitation Learning by Coaching
  He He · Hal Daumé III · Jason Eisner