The goal of our workshop is to bring together privacy experts working in academia and industry to discuss the present and future of technologies that enable machine learning with privacy. The workshop will focus on the technical aspects of privacy research and deployment, with invited and contributed talks by distinguished researchers in the area. By design, the workshop should serve as a meeting point for regular NeurIPS attendees interested in or working on privacy to meet other parts of the privacy community (security researchers, legal scholars, industry practitioners). The focus this year will include emerging problems such as machine unlearning, privacy-fairness tradeoffs, and legal challenges in recent deployments of differential privacy (e.g. that of the US Census Bureau). We will conclude the workshop with a panel discussion titled “Machine Learning and Privacy in Practice: Challenges, Pitfalls and Opportunities”, in which a diverse set of panelists will address the challenges faced in applying these technologies to the real world. The programme of the workshop emphasizes the diversity of points of view on the problem of privacy. We will also ensure that there is ample time for discussions that encourage networking between researchers, which should result in mutually beneficial new long-term collaborations.
Tue 1:20 a.m. - 1:30 a.m. | Introduction (Opening)
Tue 1:30 a.m. - 2:00 a.m. | Invited talk: Emiliano De Cristofaro (University College London) --- Privacy in Machine Learning -- It's Complicated
Tue 2:00 a.m. - 2:15 a.m. | Emiliano Q&A (Q&A)
Tue 2:15 a.m. - 2:30 a.m. | Coffee break
Tue 2:30 a.m. - 2:45 a.m. | Differential Privacy via Group Shuffling (Contributed talk)
The past decade has seen data privacy emerge as a fundamental and pressing issue. Among the tools developed to tackle it, differential privacy has emerged as a central and principled framework, with specific variants capturing various threat models. In particular, the recently proposed shuffle model of differential privacy allows for promising tradeoffs between accuracy and privacy. However, the shuffle model may not be suitable in all situations, as it relies on a distributed setting where all users can coordinate and trust (or simulate) a joint shuffling algorithm. To address this, we introduce a new model, the group shuffle model, in which users are partitioned into several groups, each group having its own local shuffler. We investigate the privacy/accuracy tradeoffs in our model by comparing it to both the shuffle and local models of privacy, which it in some sense interpolates between. In addition to general relations between group shuffle, shuffle, and local privacy, we provide a detailed comparison of the costs and benefits of the group shuffle model, giving both upper and lower bounds for the specific task of binary summation.
Amir Mohammad Abouei · Clement Canonne
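The following toy sketch (Python) illustrates the communication pattern behind the group shuffle model described above: users are split into groups, each user applies a local randomizer, and each group's shuffler permutes only its own members' reports. The choice of binary randomized response, the parameter values, and the debiasing step are illustrative assumptions for a binary-summation task, not the protocols or bounds from the paper.

import numpy as np

rng = np.random.default_rng(0)

def randomized_response(bit, eps):
    """Local randomizer: keep the bit w.p. e^eps / (e^eps + 1), flip otherwise."""
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit

def group_shuffle(bits, group_ids, eps):
    """Each group's shuffler permutes the locally randomized reports of its own members."""
    reports = {}
    for g in np.unique(group_ids):
        members = np.where(group_ids == g)[0]
        noisy = [randomized_response(int(bits[i]), eps) for i in members]
        rng.shuffle(noisy)                     # per-group shuffler
        reports[int(g)] = noisy
    return reports

# toy binary-summation example: 1000 users split into 4 groups
n, n_groups, eps = 1000, 4, 1.0
bits = rng.integers(0, 2, size=n)
groups = rng.integers(0, n_groups, size=n)
reports = group_shuffle(bits, groups, eps)

# debias the randomized-response counts to estimate the overall sum
p = np.exp(eps) / (np.exp(eps) + 1.0)
noisy_sum = sum(sum(r) for r in reports.values())
estimate = (noisy_sum - n * (1 - p)) / (2 * p - 1)
print(f"true sum = {bits.sum()}, estimate = {estimate:.1f}")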
Tue 2:45 a.m. - 3:00 a.m. | SoK: Privacy-preserving Clustering (Extended Abstract) (Contributed talk)
SlidesLive Video » Clustering is a popular unsupervised machine learning technique that groups similar input elements into clusters. In many applications, sensitive information is clustered that should not be leaked. Moreover, nowadays it is often required to combine data from multiple sources to increase the quality of the analysis as well as to outsource complex computation to powerful cloud servers. This calls for efficient privacy-preserving clustering. In this work, we systematically analyze the state-of-the-art in privacy-preserving clustering. We implement and benchmark today's four most efficient fully private clustering protocols by Cheon et al. (SAC'19), Meng et al. (ArXiv'19), Mohassel et al. (PETS'20), and Bozdemir et al. (ASIACCS'21) with respect to communication, computation, and clustering quality. |
Helen Möllering · Hossein Yalame · Thomas Schneider · Aditya Hegde
Tue 3:00 a.m. - 3:15 a.m. | Contributed talk Q&A (Q&A)
Tue 3:15 a.m. - 3:30 a.m. | Coffee Break
Tue 3:30 a.m. - 4:30 a.m. | Poster Session (Poster)
Tue 4:30 a.m. - 5:15 a.m. | Panel
Catuscia Palamidessi · Carmela Troncoso · Yang Zhang
Tue 8:20 a.m. - 8:30 a.m. | Introduction (Opening)
Tue 8:30 a.m. - 9:00 a.m. | Invited talk: Helen Nissenbaum (Cornell Tech) --- Practical Privacy, Fairness, Ethics, Policy
Tue 9:00 a.m. - 9:30 a.m. | Invited talk: Aaron Roth (UPenn / Amazon) --- Machine Unlearning
Tue 9:30 a.m. - 10:00 a.m. | Q&A for Helen and Aaron (Q&A)
Tue 10:00 a.m. - 10:15 a.m. | Coffee break
Tue 10:15 a.m. - 11:15 a.m. | Poster Session (Gather.Town)
Tue 11:15 a.m. - 11:30 a.m. | Coffee break
Tue 11:30 a.m. - 12:00 p.m. | Invited talk: Kristin E. Lauter (Facebook AI Research) --- ML on Encrypted Data
Tue 12:00 p.m. - 12:15 p.m. | Q&A for Kristin (Q&A)
Tue 12:15 p.m. - 12:30 p.m. | Privacy-Aware Rejection Sampling (Contributed talk)
Differential privacy (DP) offers strong protection against adversaries with arbitrary side-information and computational power. However, many implementations of DP mechanisms leave themselves vulnerable to side-channel attacks, such as timing attacks. Because many privacy mechanisms, such as the exponential mechanism, do not lend themselves to easy implementations, sampling methods such as MCMC or rejection sampling are used, and their runtime can leak privacy. In this work, we quantify the privacy cost due to the runtime of a rejection sampler in terms of $(\epsilon,\delta)$-DP. We also propose three modifications to the rejection sampling algorithm to protect against timing attacks by making the runtime independent of the data. Finally, we use our techniques to develop an adaptive rejection sampler for log-Hölder densities, which also has data-independent runtime.
Jordan Awan · Vinayak Rao
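As a rough illustration of why a sampler's runtime matters and how it can be made data-independent, the sketch below runs a rejection sampler for a fixed number of accept/reject trials regardless of the target density, returning the first accepted draw. This is a simplified stand-in, assuming a toy truncated-normal target; it is not one of the three modifications proposed in the paper.

import numpy as np

rng = np.random.default_rng(1)

def fixed_budget_rejection_sampler(target_pdf, proposal_sampler, proposal_pdf,
                                   bound_M, n_iters=1000):
    """Rejection sampling with a fixed iteration budget.

    The loop always performs exactly n_iters accept/reject trials, so the
    number of iterations does not depend on the (data-dependent) target
    density; only the first accepted draw is returned.  If nothing is
    accepted, the function signals failure with None.
    """
    accepted = None
    for _ in range(n_iters):                   # runtime independent of the data
        x = proposal_sampler()
        u = rng.random()
        if u < target_pdf(x) / (bound_M * proposal_pdf(x)) and accepted is None:
            accepted = x                       # keep the first acceptance
    return accepted

# toy target: standard normal density restricted to [0, 1]; proposal: Uniform(0, 1)
target = lambda x: np.exp(-0.5 * x ** 2)
proposal_pdf = lambda x: 1.0
sample = fixed_budget_rejection_sampler(target, lambda: rng.random(),
                                        proposal_pdf, bound_M=1.0)
print("sample:", sample)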
Tue 12:30 p.m. - 12:45 p.m. | Population Level Privacy Leakage in Binary Classification with Label Noise (Contributed talk)
We study the privacy limitations of label differential privacy. Label differential privacy has emerged as an intermediate trust model between local and central differential privacy, where only the label of each training example is protected (and the features are assumed to be public). We show that the guarantees provided by label DP are significantly weaker than they appear, as an adversary can "un-noise" the perturbed labels. Formally, we show that the privacy loss has a close connection with Jeffreys' divergence of the conditional distribution between positive and negative labels, which allows explicit formulation of the trade-off between utility and privacy in this setting. Our results suggest how to select public features that optimize this trade-off. We show, however, that there is no free lunch --- instances where label differential privacy guarantees are strong are exactly those where a good classifier does not exist. We complement the negative results with a non-parametric estimator for the true privacy loss, and apply our techniques on large-scale benchmark data to demonstrate how to achieve a desired privacy protection.
Róbert Busa-Fekete · Andres Munoz Medina · Umar Syed · Sergei Vassilvitskii
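The quantity at the heart of the result above, Jeffreys' divergence between the feature distributions conditioned on the positive and negative labels, is straightforward to compute for discrete distributions; the sketch below does so for a made-up three-valued feature. The distributions are illustrative only.

import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions on the same support."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jeffreys(p, q):
    """Symmetrized KL, i.e. Jeffreys' divergence J(p, q) = KL(p||q) + KL(q||p)."""
    return kl(p, q) + kl(q, p)

# toy example: distribution of a discretized public feature conditioned on the label
p_x_given_pos = np.array([0.70, 0.20, 0.10])
p_x_given_neg = np.array([0.30, 0.40, 0.30])
print("Jeffreys divergence:", jeffreys(p_x_given_pos, p_x_given_neg))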
Tue 12:45 p.m. - 1:00 p.m. | Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing (Contributed talk)
Differentially private learning has seen limited success for deep learning models of text, resulting in a perception that differential privacy may be incompatible with the language model fine-tuning paradigm. We demonstrate that this perception is inaccurate and that, with the right setup, high-performing private models can be learned on moderately-sized corpora by directly fine-tuning with differentially private optimization. Our work highlights the important role of hyperparameters, task formulations, and pretrained models. Our analyses also show that the low performance of naive differentially private baselines in prior work is attributable to suboptimal choices in these factors. Empirical results reveal that differentially private optimization does not suffer from dimension-dependent performance degradation with pretrained models and achieves performance on par with state-of-the-art private training procedures and strong non-private baselines.
Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto
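For readers unfamiliar with the private optimization referred to above, the sketch below shows the core DP-SGD update (per-example gradient clipping plus Gaussian noise) in plain NumPy. The gradients, clipping norm, and noise multiplier are placeholder values; the paper's experiments use full fine-tuning pipelines rather than this toy loop.

import numpy as np

rng = np.random.default_rng(2)

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0):
    """One step of DP-SGD: clip each example's gradient, sum, add noise, average.

    per_example_grads: array of shape (batch_size, dim).
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale                  # per-example clipping
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_example_grads.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean

# toy usage: 8 examples, 3-dimensional parameter vector
params = np.zeros(3)
grads = rng.normal(size=(8, 3))
params = dp_sgd_step(params, grads)
print(params)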
Tue 1:00 p.m. - 1:15 p.m. | Canonical Noise Distributions and Private Hypothesis Tests (Contributed talk)
In the setting of $f$-DP, we propose the concept \emph{canonical noise distribution} (CND) which captures whether an additive privacy mechanism is tailored for a given $f$, and give a construction of a CND for an arbitrary tradeoff function $f$. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private $p$-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data. We apply our techniques to difference of proportions testing.
Jordan Awan · Salil Vadhan
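As background for the $f$-DP setting used above, the sketch below evaluates the Gaussian tradeoff function $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$ and the corresponding additive Gaussian mechanism. It illustrates tradeoff functions generally; it does not implement the paper's canonical-noise-distribution construction for arbitrary $f$.

import numpy as np
from scipy.stats import norm

def gaussian_tradeoff(alpha, mu):
    """Tradeoff function G_mu of mu-Gaussian DP: the type-II error of the
    optimal distinguishing test at type-I error alpha."""
    return norm.cdf(norm.ppf(1.0 - np.asarray(alpha, float)) - mu)

def gaussian_mechanism(value, sensitivity, mu, rng):
    """Additive Gaussian noise with sigma = sensitivity / mu, giving mu-GDP."""
    sigma = sensitivity / mu
    return value + rng.normal(0.0, sigma)

rng = np.random.default_rng(3)
alphas = np.linspace(0.0, 1.0, 6)
print("G_1(alpha):", np.round(gaussian_tradeoff(alphas, mu=1.0), 3))
print("noisy count:", gaussian_mechanism(42.0, sensitivity=1.0, mu=1.0, rng=rng))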
Tue 1:15 p.m. - 1:45 p.m. | Q&A for four contributed talks (Q&A)
Tue 1:45 p.m. - 2:30 p.m. | Panel
Oluwaseyi Feyisetan · Helen Nissenbaum · Aaron Roth · Christine Task
Tue 2:30 p.m. - 2:40 p.m. | Closing
- | An automatic differentiation system for the age of differential privacy (Poster)
SlidesLive Video » We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML). Optimal noise calibration in this setting requires efficient Jacobian matrix computations and tight bounds on the L2-sensitivity. Our framework achieves these objectives by relying on a functional analysis-based method for sensitivity tracking, which we briefly outline. This approach interoperates naturally and seamlessly with static graph-based automatic differentiation, which enables order-of-magnitude improvements in compilation times compared to previous work. Moreover, we demonstrate that optimising the sensitivity of the entire computational graph at once yields substantially tighter estimates of the true sensitivity compared to interval bound propagation techniques. Our work naturally befits recent developments in DP such as individual privacy accounting, aiming to offer improved privacy-utility trade-offs, and represents a step towards the integration of accessible machine learning tooling with advanced privacy accounting systems. |
Dmitrii Usynin · Alexander Ziller · Moritz Knolle · Daniel Rueckert · Georgios Kaissis
- | Communication Efficient Federated Learning with Secure Aggregation and Differential Privacy (Poster)
Optimizing the privacy-utility-communication tradeoff is a key challenge for federated learning. Under distributed differential privacy (DP) via secure aggregation (SecAgg), we prove that the worst-case communication cost per client must be at least $\Omega\left( d \log \left( \frac{n^2\varepsilon^2}{d} \right) \right)$ to achieve $O\left( \frac{d}{n^2\varepsilon^2} \right)$ centralized error, which matches the error under central DP. Despite this bound, we leverage the near-sparse structure of model updates, evidenced through recent empirical studies, to obtain improved tradeoffs for distributed DP. In particular, we leverage linear compression methods, namely sketching, to attain compression rates of up to $50\times$ with no significant decrease in model test accuracy while achieving a noise multiplier of $0.5$. Our work demonstrates that fundamental tradeoffs in differentially private federated learning can be drastically improved in practice.
Wei-Ning Chen · Christopher Choquette-Choo · Peter Kairouz
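The linear "sketching" compression mentioned in the abstract can be illustrated with a basic count sketch: each coordinate is hashed into a small number of buckets with a random sign, and coordinates of a near-sparse vector can be approximately recovered. The sketch sizes and the median-of-rows decoder below are illustrative choices, not the paper's pipeline (which additionally involves secure aggregation and DP noise).

import numpy as np

rng = np.random.default_rng(4)

def count_sketch(x, rows=5, width=64, seed=0):
    """Linear count-sketch compression of a vector x (d values -> rows*width values)."""
    r = np.random.default_rng(seed)
    d = len(x)
    bucket = r.integers(0, width, size=(rows, d))    # bucket hash per row
    sign = r.choice([-1.0, 1.0], size=(rows, d))     # sign hash per row
    sketch = np.zeros((rows, width))
    for i in range(rows):
        np.add.at(sketch[i], bucket[i], sign[i] * x)
    return sketch, bucket, sign

def decode(sketch, bucket, sign):
    """Each row's estimate s(i) * sketch[h(i)] is unbiased for x_i;
    taking the median over rows dampens the effect of collisions."""
    rows, d = bucket.shape
    est = np.stack([sign[i] * sketch[i, bucket[i]] for i in range(rows)])
    return np.median(est, axis=0)

x = np.zeros(1000)
x[:20] = rng.normal(size=20)           # near-sparse "model update"
sk, b, s = count_sketch(x)
x_hat = decode(sk, b, s)
print("relative L2 error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))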
- | Realistic Face Reconstruction from Deep Embeddings (Poster)
Modern face recognition systems use deep convolutional neural networks to extract latent embeddings from face images. Since basic arithmetic operations on the embeddings are needed to make comparisons, generic encryption schemes cannot be used. This leaves facial embeddings unprotected and susceptible to privacy attacks that reconstruct facial identity. We propose a search algorithm on the latent vector space of StyleGAN to find a matching face. Our process yields latent vectors that generate face images that are high-resolution, realistic, and reconstruct relevant attributes of the original face. Further, we demonstrate that our process is capable of fooling FaceNet, a state-of-the-art face recognition system.
Edward Vendrow · Joshua Vendrow
- | Certified Predictions using MPC-Friendly Publicly Verifiable Covertly Secure Commitments (Poster)
SlidesLive Video » We address the problem of efficiently and securely enabling certified predictions on deep learning models. This addresses the scenario where a party P1 owns a confidential model that has been certified by an authority to have a certain property e.g. fairness. Subsequently, another party P2 wants to perform a prediction on the model with an assurance that the certified model was used. We present a solution for this problem based on MPC commitments. Our constructions operate in the publicly verifiable covert (PVC) security model, which is a relaxation of the malicious model of MPC, appropriate in settings where P1 faces a reputational harm if caught cheating. We introduce the notion of a PVC commitment scheme and indexed hash functions to build commitment schemes tailored to the PVC framework, and propose constructions for both arithmetic and Boolean circuits that result in very efficient circuits. From a practical standpoint, our constructions for Boolean circuits are 60x faster to evaluate securely, and use 36x less communication than baseline methods based on hashing. Moreover, we show that our constructions are tight in terms of required non-linear operations, and present a technique to amplify the security properties of our constructions that allows to efficiently recover malicious guarantees with statistical security. |
Nitin Agrawal · James Bell · Matt Kusner
- | Mean Estimation with User-level Privacy under Data Heterogeneity (Poster)
A key challenge for data analysis in the federated setting is that user data is heterogeneous, i.e., it cannot be assumed to be sampled from the same distribution. Further, in practice, different users may possess vastly different numbers of samples. In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We demonstrate near asymptotic optimality of our estimator among nearly unbiased estimators. In particular, while the optimal non-private estimator can be shown to be linear, we show that privacy constrains us to use a non-linear estimator.
Rachel Cummings · Vitaly Feldman · Audra McMillan · Kunal Talwar
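A minimal baseline for user-level private mean estimation, for contrast with the estimator above, is to clip each user's local mean and add Gaussian noise scaled to one user's influence; the sketch below does exactly that. The clipping bound and noise multiplier are arbitrary, and this simple linear estimator is precisely the kind of baseline the abstract argues is suboptimal under heterogeneity.

import numpy as np

rng = np.random.default_rng(5)

def user_level_private_mean(user_samples, clip=1.0, noise_multiplier=1.0):
    """Simple user-level DP mean: clip each user's local mean to [-clip, clip],
    average across users, and add Gaussian noise scaled to one user's influence."""
    local_means = np.array([np.clip(np.mean(s), -clip, clip) for s in user_samples])
    n_users = len(local_means)
    # replacing one user's clipped mean moves the average by at most 2*clip/n_users
    sensitivity = 2.0 * clip / n_users
    noise = rng.normal(0.0, noise_multiplier * sensitivity)
    return local_means.mean() + noise

# heterogeneous users: varying distributions and sample counts
users = [rng.normal(0.3, 1.0, size=rng.integers(1, 200)) for _ in range(500)]
print("private estimate:", user_level_private_mean(users))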
- | DP-SEP: Differentially private stochastic expectation propagation (Poster)
We are interested in privatizing an approximate posterior inference algorithm called Expectation Propagation (EP). EP approximates the posterior by iteratively refining approximations to the local likelihoods, and is known to provide better posterior uncertainties than those given by variational inference. However, using EP for large-scale datasets imposes a challenge in terms of memory requirements, as it needs to maintain each of the local approximations in memory. To overcome this problem, stochastic expectation propagation (SEP) was proposed, which only considers a unique local factor that captures the average effect of each likelihood term on the posterior and refines it in a way analogous to EP. In this work, we therefore focus on developing a differentially private stochastic expectation propagation (DP-SEP) algorithm, which outputs differentially private natural parameters of the exponential-family posteriors in each step of SEP.
Margarita Vinaroz · Mijung Park
- | Label Private Deep Learning Training based on Secure Multiparty Computation and Differential Privacy (Poster)
Secure Multiparty Computation (MPC) is an invaluable tool for training machine learning models when the training data cannot be directly accessed by the model trainer. Unfortunately, complex algorithms, such as deep learning models, have their computational complexities increased by orders of magnitude when performed using MPC protocols. In this contribution, we study how to efficiently train an important class of machine learning problems by using MPC where features are known by one of the computing parties and only the labels are private. We propose new protocols combining differential privacy (DP) and MPC in order to privately and efficiently train a deep learning model in such a scenario. More specifically, we release differentially private information during the MPC computation to dramatically reduce the training time. All released information does not compromise the privacy of the labels at the individual level. Our protocols can have running times that are orders of magnitude better than a straightforward use of MPC, at a moderate cost in model accuracy.
Sen Yuan · Milan Shen · Ilya Mironov · Anderson Nascimento
- | Private Confidence Sets (Poster)
We consider statistical inference under privacy constraints. In particular, we give differentially private algorithms for estimating coverage probabilities and computing valid confidence sets, and prove upper bounds on the error of our estimates and the length of our confidence sets. Our bounds apply to broad classes of data distributions and statistics of interest, and for fixed $\varepsilon$ we match the higher-order asymptotic accuracy of the standard (non-private) non-parametric bootstrap.
Karan Chadha · John Duchi · Rohith Kuditipudi
- | A Joint Exponential Mechanism for Differentially Private Top-k Set (Poster)
SlidesLive Video » We present a novel differentially private algorithm for releasing the set of k elements with the highest counts from a data domain of d elements. We define a ``joint'' instance of the exponential mechanism (EM) whose output space consists of all O(d^k) size-k subsets; yet, we are able to show how to sample from this EM in only time O(dk^3). Experiments suggest that this joint approach can yield utility improvements over the existing state of the art for small problem sizes. |
Andres Munoz Medina · Matthew Joseph · Jennifer Gillenwater · Monica Ribero Diaz
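For comparison with the joint mechanism described above, the sketch below implements the standard "peeling" baseline: the exponential mechanism is run k times over single elements with an evenly split budget. It is not the joint exponential mechanism from the paper, whose output space is all size-k subsets.

import numpy as np

rng = np.random.default_rng(6)

def exponential_mechanism(scores, eps, sensitivity=1.0):
    """Sample an index with probability proportional to exp(eps * score / (2 * sensitivity))."""
    logits = eps * np.asarray(scores, float) / (2.0 * sensitivity)
    logits -= logits.max()                     # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

def peeling_top_k(counts, k, eps):
    """Baseline 'peeling' top-k: run the exponential mechanism k times,
    removing each selected element, with the budget split evenly."""
    counts = np.asarray(counts, float).copy()
    selected = []
    for _ in range(k):
        idx = exponential_mechanism(counts, eps / k)
        selected.append(int(idx))
        counts[idx] = -np.inf                  # never pick the same element twice
    return selected

counts = rng.integers(0, 1000, size=50)
print("private top-5:", peeling_top_k(counts, k=5, eps=2.0))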
- | Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning (Poster)
SlidesLive Video » End-to-end encryption (E2EE) in messaging platforms enables people to securely and privately communicate with one another. Its widespread adoption however raised concerns that illegal content might now be shared undetected. Client-side scanning based on perceptual hashing has been recently proposed by governments and researchers to detect illegal content in E2EE communications. We propose the first framework to evaluate the robustness of perceptual hashing-based client-side scanning to detection avoidance attacks and show current systems to not be robust. We propose three adversarial attacks---a general black-box attack and two white-box attacks for discrete cosine transform-based algorithms--against perceptual hashing algorithms. In a large-scale evaluation, we show perceptual hashing-based client-side scanning mechanisms to be highly vulnerable to detection avoidance attacks in a black-box setting, with more than 99.9\% of images successfully attacked while preserving the content of the image. We further show several mitigation strategies, such as expanding the database with hashes of images modified using our attack, or increasing the detection threshold, to be ineffective against our attack. Taken together, our results shed serious doubts on the robustness of perceptual hashing-based client-side scanning mechanisms currently proposed by governments, organizations, and researchers around the world. |
Shubham Jain · Ana-Maria Cretu · Yves-Alexandre Montjoye
- | Characterizing and Improving MPC-based Private Inference for Transformer-based Models (Poster)
Secure multi-party computation (MPC) is gaining popularity with the growing demand for privacy-preserving cloud services. While there has been plenty of attention to MPC for convolutional neural networks (CNNs), MPC-based private inference for Transformer models has not been studied in detail. This paper provides a characterization study of the performance overhead of running Transformer models with secure MPC, and proposes an optimization for embedding tables. Our study shows that Transformers introduce a couple of new challenges for MPC-based private inference: softmax and embedding tables. To address the overhead of embedding table accesses under MPC, we propose to use tensor-train (TT) decomposition, a mechanism that splits a large embedding table into multiple smaller embedding tables. For NLP workloads, the experiments show that TT decomposition can speed up embedding table accesses by 2x with only a 1.19 drop in the masked-language-model perplexity score.
Yongqin Wang · Brian Knott · Murali Annavaram · Hsien-Hsin Lee
- | SoK: Privacy-preserving Clustering (Extended Abstract) (Poster)
SlidesLive Video » Clustering is a popular unsupervised machine learning technique that groups similar input elements into clusters. In many applications, sensitive information is clustered that should not be leaked. Moreover, nowadays it is often required to combine data from multiple sources to increase the quality of the analysis as well as to outsource complex computation to powerful cloud servers. This calls for efficient privacy-preserving clustering. In this work, we systematically analyze the state-of-the-art in privacy-preserving clustering. We implement and benchmark today's four most efficient fully private clustering protocols by Cheon et al. (SAC'19), Meng et al. (ArXiv'19), Mohassel et al. (PETS'20), and Bozdemir et al. (ASIACCS'21) with respect to communication, computation, and clustering quality. |
Helen Möllering · Hossein Yalame · Aditya Hegde · Thomas Schneider
- | Membership Inference Attacks Against NLP Classification Models (Poster)
SlidesLive Video » The success of natural language processing (NLP) is making NLP applications commonplace. Unfortunately, recent research has shown that privacy might be at stake given that these models are often trained on private user data. While privacy risks are demonstrated in text generation settings, privacy risks of the text classification settings, which subsume myriad downstream applications, are largely unexplored. In this work, we study the susceptibility of NLP classification models, used for text classification tasks, to membership inference (MI), which is a fundamental type of privacy leakage. We design a comprehensive suite of attacks to assess the risk of sample-level MI, as well as that of relatively unexplored user-level MI. We introduce novel user-level MI attacks that outperform the existing attacks and conduct experiments on Transformer-based and RNN-based NLP models. Our evaluations show that user-level MI is significantly stronger than sample-level MI. We further perform in-depth analyses showing the effect of various NLP-specific parameters on MI against NLP classification models. |
Virat Shejwalkar · Huseyin A Inan · Amir Houmansadr · Robert Sim
- | A Generic Hybrid 2PC Framework with Application to Private Inference of Unmodified Neural Networks (Extended Abstract) (Poster)
SlidesLive Video » We present a new framework for generic mixed-protocol secure two-party computation (2PC) and private evaluation of neural networks based on the recent MOTION framework (Braun et al., ePrint '20). We implement five different 2PC protocols in the semi-honest setting -- Yao's garbled circuits, arithmetic and Boolean variants of Goldreich-Micali-Wigderson (GMW), and two secret-sharing-based protocols from ABY2.0 (Patra et al., USENIX Security '21) -- together with 20 conversions among each other and new optimizations. We explore the feasibility of evaluating neural networks with 2PC without making modifications to their structure, and provide secure tensor data types and specialized building blocks for common tensor operations. By supporting the Open Neural Network Exchange (ONNX) file format, this yields an easy-to-use solution for privately evaluating neural networks, and is interoperable with industry-standard deep learning frameworks such as TensorFlow and PyTorch. By exploiting the networks' high-level structure and using common 2PC techniques, we obtain a performance that is comparable to that of recent, highly optimized works and significantly better than when using generic 2PC for low-level hybrid circuits. |
Lennart Braun · Thomas Schneider · Rosario Cammarota
- | ABY2.0: New Efficient Primitives for STPC with Applications to Privacy in Machine Learning (Extended Abstract) (Poster)
In this work, we improve semi-honest secure two-party computation (STPC) over rings, specially for privacy-preserving machine learning, with a focus on the efficiency of the online phase. We construct efficient protocols for several privacy-preserving machine learning (PPML) primitives such as scalar product, matrix multiplication, ReLU, and maxpool. The online communication of our scalar product is two ring elements {\em irrespective} of the vector dimension, which is a feature achieved for the first time in PPML literature. We implement and benchmark training and inference of Logistic Regression and Neural Networks over LAN and WAN networks. For training, we improve online runtime (both for LAN and WAN) over SecureML (Mohassel et al., IEEE S\&P'17) in the range $1.5\times$--$6.1\times$, while for inference, the improvements are in the range of $2.5\times$--$754.3\times$.
Arpita Patra · Hossein Yalame · Thomas Schneider · Ajith Suresh
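To make the "scalar product from secret shares" setting concrete, the sketch below shows a textbook semi-honest two-party scalar product over a ring using additive shares and a Beaver triple from a simulated dealer. It is a generic construction for illustration only; ABY2.0's protocols use a different sharing that brings the online communication down to two ring elements, which this sketch does not reproduce.

import random

random.seed(0)
MOD = 2 ** 32            # work over the ring Z_{2^32}

def share(vec):
    """Additively secret-share a vector between two parties."""
    s0 = [random.randrange(MOD) for _ in vec]
    s1 = [(v - r) % MOD for v, r in zip(vec, s0)]
    return s0, s1

def open_shares(s0, s1):
    return [(a + b) % MOD for a, b in zip(s0, s1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % MOD

def shared_dot_product(x_sh, y_sh):
    """Scalar product of two shared vectors using a Beaver triple.
    A (simulated) dealer provides shares of random a, b and of c = <a, b>;
    the parties then only exchange the masked openings d = x - a and e = y - b."""
    d = len(x_sh[0])
    a = [random.randrange(MOD) for _ in range(d)]
    b = [random.randrange(MOD) for _ in range(d)]
    a_sh, b_sh, c_sh = share(a), share(b), share([dot(a, b)])

    d_open = open_shares([(x - ai) % MOD for x, ai in zip(x_sh[0], a_sh[0])],
                         [(x - ai) % MOD for x, ai in zip(x_sh[1], a_sh[1])])
    e_open = open_shares([(y - bi) % MOD for y, bi in zip(y_sh[0], b_sh[0])],
                         [(y - bi) % MOD for y, bi in zip(y_sh[1], b_sh[1])])

    # party i computes z_i = c_i + <e, a_i> + <d, b_i>; the public term <d, e> is added once
    z = [(c_sh[i][0] + dot(e_open, a_sh[i]) + dot(d_open, b_sh[i])) % MOD for i in (0, 1)]
    z[0] = (z[0] + dot(d_open, e_open)) % MOD
    return z             # each party holds one additive share of <x, y>

x, y = [3, 1, 4, 1, 5], [2, 7, 1, 8, 2]
z_sh = shared_dot_product(share(x), share(y))
print("reconstructed:", sum(z_sh) % MOD, " expected:", dot(x, y))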
- | Combining Public and Private Data (Poster)
SlidesLive Video » Differential privacy is widely adopted to provide provable privacy guarantees in data analysis. We consider the problem of combining public and private data (and, more generally, data with heterogeneous privacy needs) for estimating aggregate statistics. We introduce a mixed estimator of the mean optimized to minimize the variance. We argue that our mechanism is preferable to techniques that preserve the privacy of individuals by subsampling data proportionally to the privacy needs of users. Similarly, we present a mixed median estimator based on the exponential mechanism. We compare our mechanisms to the methods proposed in Jorgensen et al. [2015]. Our experiments provide empirical evidence that our mechanisms often outperform the baseline methods. |
Cecilia Ferrando · Jennifer Gillenwater · Alex Kulesza
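One simple way to combine a public and a private sample, in the spirit of the mixed mean estimator above, is an inverse-variance weighting of the public mean and a Laplace-noised private mean; the sketch below takes that approach. The weighting, the Laplace calibration, and the (non-private) variance plug-in are assumptions of this illustration and may differ from the estimator analyzed in the paper.

import numpy as np

rng = np.random.default_rng(8)

def mixed_mean(public, private, eps, data_range=1.0):
    """Combine a public sample mean with a Laplace-noised private mean,
    weighting each by the inverse of its (approximate) variance."""
    n_pub, n_priv = len(public), len(private)
    # NOTE: a real mechanism would estimate this variance privately or fix the weights in advance
    sigma2 = np.var(np.concatenate([public, private]))
    priv_mean = np.mean(private) + rng.laplace(0.0, data_range / (n_priv * eps))

    var_pub = sigma2 / n_pub
    var_priv = sigma2 / n_priv + 2.0 * (data_range / (n_priv * eps)) ** 2
    w_pub = (1.0 / var_pub) / (1.0 / var_pub + 1.0 / var_priv)
    return w_pub * np.mean(public) + (1.0 - w_pub) * priv_mean

public = rng.uniform(0, 1, size=200)      # users with no privacy requirement
private = rng.uniform(0, 1, size=2000)    # users requiring eps-DP
print("combined estimate:", mixed_mean(public, private, eps=0.5))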
- | Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods (Poster)
SlidesLive Video » We study private synthetic data generation for query release, where the goal is to construct a sanitized version of a sensitive dataset, subject to differential privacy, that approximately preserves the answers to a large collection of statistical queries. We first present an algorithmic framework that unifies a long line of iterative algorithms in the literature. Under this framework, we propose two new methods. The first method, private entropy projection (PEP), can be viewed as an advanced variant of MWEM that adaptively reuses past query measurements to boost accuracy. Our second method, generative networks with the exponential mechanism (GEM), circumvents computational bottlenecks in algorithms such as MWEM and PEP by optimizing over generative models parameterized by neural networks, which capture a rich family of distributions while enabling fast gradient-based optimization. We demonstrate that PEP and GEM empirically outperform existing algorithms. Furthermore, we show that GEM nicely incorporates prior information from public data while overcoming limitations of PMW^Pub, the existing state-of-the-art method that also leverages public data. |
Terrance Liu · Giuseppe Vietri · Steven Wu
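For context on the iterative framework mentioned above, the sketch below implements a bare-bones version of classic MWEM (select a poorly answered query with the exponential mechanism, measure it with Laplace noise, update the synthetic histogram multiplicatively). The domain, queries, and budget split are toy choices; the paper's PEP and GEM methods modify this loop in ways not shown here.

import numpy as np

rng = np.random.default_rng(9)

def mwem(true_hist, queries, eps, iters=10):
    """Classic MWEM over a small discrete domain, with the per-round budget
    split evenly between query selection and measurement."""
    n = true_hist.sum()
    synth = np.ones_like(true_hist, dtype=float) / len(true_hist) * n
    eps_round = eps / iters
    for _ in range(iters):
        errors = np.array([abs(q @ true_hist - q @ synth) for q in queries])
        probs = np.exp(eps_round / 2 * errors / 2)     # exponential mechanism, sensitivity 1
        probs /= probs.sum()
        i = rng.choice(len(queries), p=probs)
        measurement = queries[i] @ true_hist + rng.laplace(0.0, 2.0 / eps_round)
        # multiplicative weights update toward the noisy measurement
        synth *= np.exp(queries[i] * (measurement - queries[i] @ synth) / (2.0 * n))
        synth *= n / synth.sum()
    return synth

domain = 16
true_hist = rng.integers(0, 100, size=domain).astype(float)
queries = [rng.integers(0, 2, size=domain).astype(float) for _ in range(30)]  # random counting queries
synth = mwem(true_hist, queries, eps=1.0)
print("max query error:", round(max(abs(q @ true_hist - q @ synth) for q in queries), 1))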
- | Unsupervised Membership Inference Attacks Against Machine Learning Models (Poster)
As a form of privacy leakage for machine learning (ML), membership inference (MI) attacks aim to infer whether given data samples have been used to train a target ML model. Existing state-of-the-art MI attacks in black-box settings adopt a so-called shadow model to perform transfer attacks. Such attacks achieve high inference accuracy but rely on many adversarial assumptions, such as having a dataset from the same distribution as the target model's training data and knowledge of the target model structure. We propose a novel MI attack, called UMIA, which probes the target model in an unsupervised way without any shadow model. We relax all the adversarial assumptions above, demonstrating that MI attacks are applicable without any knowledge about the target model and its training set. We empirically show that, with far fewer adversarial assumptions and computational resources, UMIA can perform on par with the state-of-the-art supervised MI attack.
YUEFENG PENG
- | Population Level Privacy Leakage in Binary Classification with Label Noise (Poster)
We study the privacy limitations of label differential privacy. Label differential privacy has emerged as an intermediate trust model between local and central differential privacy, where only the label of each training example is protected (and the features are assumed to be public). We show that the guarantees provided by label DP are significantly weaker than they appear, as an adversary can "un-noise" the perturbed labels. Formally, we show that the privacy loss has a close connection with Jeffreys' divergence of the conditional distribution between positive and negative labels, which allows explicit formulation of the trade-off between utility and privacy in this setting. Our results suggest how to select public features that optimize this trade-off. We show, however, that there is no free lunch --- instances where label differential privacy guarantees are strong are exactly those where a good classifier does not exist. We complement the negative results with a non-parametric estimator for the true privacy loss, and apply our techniques on large-scale benchmark data to demonstrate how to achieve a desired privacy protection.
Róbert Busa-Fekete · Andres Munoz Medina · Umar Syed · Sergei Vassilvitskii
- | A Novel Self-Distillation Architecture to Defeat Membership Inference Attacks (Poster)
SlidesLive Video » Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models, which aim to distinguish training members from non-members by exploiting differential behavior of the models on member and non-member inputs. We propose a new framework to train privacy-preserving models that induces similar behavior on member and non-member inputs to mitigate practical membership inference attacks. Our framework, called SELENA, has two major components. The first component and the core of our defense, called Split-AI, is a novel ensemble architecture for training. We prove that our Split-AI architecture defends against a large family of membership inference attacks, however, it is susceptible to new adaptive attacks. Therefore, we use a second component in our framework called Self-Distillation to protect against such stronger attacks, which (self-)distills the training dataset through our Split-AI ensemble and has no reliance on external public datasets. We perform extensive experiments on major benchmark datasets and the results show that our approach achieves a better trade-off between membership privacy and utility compared to previous defenses. |
Xinyu Tang · Saeed Mahloujifar · Liwei Song · Virat Shejwalkar · Amir Houmansadr · Prateek Mittal
- | Enforcing fairness in private federated learning via the modified method of differential multipliers (Poster)
SlidesLive Video » Federated learning with differential privacy, or private federated learning, provides a strategy to train machine learning models while respecting users’ privacy. However, differential privacy can disproportionately degrade the performance of the models on under-represented groups, as these parts of the distribution are difficult to learn in the presence of noise. Existing approaches for enforcing fairness to machine learning models have considered the centralized setting, in which the algorithm has access to the users’ data. This paper introduces an algorithm to enforce group fairness in private federated learning, where users’ data does not leave their devices. First, the paper extends the modified method of differential multipliers to empirical risk minimization with fairness constraints, thus providing an algorithm to enforce fairness in the central setting. Then, this algorithm is extended to the private federated learning setting. The proposed algorithm, FPFL, is tested on a federated version of the Adult dataset and an “unfair” version of the FEMNIST dataset. The experiments on these datasets show how private federated learning accentuates unfairness in the trained models, and how FPFL is able to mitigate such unfairness. |
Borja Rodríguez Gálvez · Filip Granqvist · Rogier van Dalen · Matthew Seigel
- | Efficient passive membership inference attack in federated learning (Poster)
In the cross-device federated learning (FL) setting, clients such as mobile devices cooperate with the server to train a global machine learning model while keeping their data locally. However, recent work shows that a client's private information can still be disclosed to an adversary who merely eavesdrops on the messages exchanged between the client and the server. For example, the adversary can infer whether the client owns a specific data instance, which is called a passive membership inference attack. In this paper, we propose a new passive inference attack that requires much less computation power and memory than existing methods. Our empirical results show that our attack achieves a higher accuracy on the CIFAR100 dataset (more than $4$ percentage points) with three orders of magnitude less memory space and five orders of magnitude fewer calculations.
CHUAN XU · Giovanni Neglia · Oualid ZARI
- | Interaction data are identifiable even across long periods of time (Poster)
Fine-grained records of people's interactions, both offline and online, are collected at a large scale. These data contain sensitive information about whom we meet, talk to, and when. We demonstrate here how people's interaction behavior is stable over long periods of time and can be used to identify individuals in anonymous datasets. Our attack learns the profile of an individual using geometric deep learning and triplet loss optimization. In a mobile phone metadata dataset of more than 40k people, it correctly identifies 52\% of individuals based on their $2$-hop interaction graph. We further show that the profiles learned by our method are stable over time and that 24\% of people are still identifiable after 20 weeks, thus making identification a real risk in practice. Our results provide strong evidence that disconnected and even re-pseudonymized interaction data can be linked together making them likely to be personal data under the European Union's General Data Protection Regulation.
Ana-Maria Cretu · Federico Monti · Stefano Marrone · Xiaowen Dong · Michael Bronstein · Yves-Alexandre Montjoye
- | Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing (Poster)
Differentially private learning has seen limited success for deep learning models of text, resulting in a perception that differential privacy may be incompatible with the language model fine-tuning paradigm. We demonstrate that this perception is inaccurate and that, with the right setup, high-performing private models can be learned on moderately-sized corpora by directly fine-tuning with differentially private optimization. Our work highlights the important role of hyperparameters, task formulations, and pretrained models. Our analyses also show that the low performance of naive differentially private baselines in prior work is attributable to suboptimal choices in these factors. Empirical results reveal that differentially private optimization does not suffer from dimension-dependent performance degradation with pretrained models and achieves performance on par with state-of-the-art private training procedures and strong non-private baselines.
Xuechen (Chen) Li · Florian Tramer · Percy Liang · Tatsunori Hashimoto
- | Feature-level privacy loss modelling in differentially private machine learning (Poster)
We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML). Optimal noise calibration in this setting requires efficient Jacobian matrix computations and tight bounds on the L2-sensitivity. Our framework achieves these objectives by relying on a functional analysis-based method for sensitivity tracking, which we briefly outline. This approach interoperates naturally and seamlessly with static graph-based automatic differentiation, which enables order-of-magnitude improvements in compilation times compared to previous work. Moreover, we demonstrate that optimising the sensitivity of the entire computational graph at once yields substantially tighter estimates of the true sensitivity compared to interval bound propagation techniques. Our work naturally befits recent developments in DP such as individual privacy accounting, aiming to offer improved privacy-utility trade-offs, and represents a step towards the integration of accessible machine learning tooling with advanced privacy accounting systems. |
Dmitrii Usynin · Alexander Ziller · Moritz Knolle · Daniel Rueckert · Georgios Kaissis
- | Opacus: User-Friendly Differential Privacy Library in PyTorch (Poster)
SlidesLive Video » We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at https://opacus.ai). Opacus is designed for simplicity, flexibility, and speed. It provides a simple and user-friendly API, and enables machine learning practitioners to make a training pipeline private by adding as little as two lines to their code. It supports a wide variety of layers, including multi-head attention, convolution, LSTM, and embedding, right out of the box, and it also provides the means for supporting other user-defined layers. Opacus computes batched per-sample gradients, providing better efficiency compared to the traditional “micro batch” approach. In this paper we present Opacus, detail the principles that drove its implementation and unique features, and compare its performance against other frameworks for differential privacy in ML. |
Ashkan Yousefpour · Igor Shilov · Alexandre Sablayrolles · Karthik Prasad · Mani Malek Esmaeili · John Nguyen · Sayan Ghosh · Akash Bharadwaj · Jessica Zhao · Graham Cormode · Ilya Mironov
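A minimal usage sketch of the library described above is given below, following the make_private API from recent Opacus releases (argument names and defaults may differ across versions; earlier releases used a different attach-style API). The toy model, data, and hyperparameters are placeholders.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# toy data and model (placeholders)
data = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
train_loader = DataLoader(data, batch_size=64)
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# the advertised "couple of lines": wrap model, optimizer and data loader
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.1,
    max_grad_norm=1.0,
)

# training loop is unchanged; per-sample clipping and noise happen inside optimizer.step()
for epoch in range(3):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

print("epsilon spent (delta=1e-5):", privacy_engine.get_epsilon(delta=1e-5))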
- | Differential Privacy via Group Shuffling (Poster)
The past decade has seen data privacy emerge as a fundamental and pressing issue. Among the tools developed to tackle it, differential privacy has emerged as a central and principled framework, with specific variants capturing various threat models. In particular, the recently proposed shuffle model of differential privacy allows for promising tradeoffs between accuracy and privacy. However, the shuffle model may not be suitable in all situations, as it relies on a distributed setting where all users can coordinate and trust (or simulate) a joint shuffling algorithm. To address this, we introduce a new model, the group shuffle model, in which users are partitioned into several groups, each group having its own local shuffler. We investigate the privacy/accuracy tradeoffs in our model by comparing it to both the shuffle and local models of privacy, which it in some sense interpolates between. In addition to general relations between group shuffle, shuffle, and local privacy, we provide a detailed comparison of the costs and benefits of the group shuffle model, giving both upper and lower bounds for the specific task of binary summation.
Amir Mohammad Abouei · Clement Canonne
- | Architecture Matters: Investigating the Influence of Differential Privacy on Neural Network Design (Poster)
We explore the relationship between neural network architectures and model accuracy under differential privacy constraints. Our findings show that architectures that perform well without differential privacy do not necessarily do so with differential privacy. This shows that extant knowledge on neural network architecture design cannot be seamlessly translated into the differential privacy context. Moreover, as neural architecture search consumes privacy budget, future research is required to better understand the relationship between neural network architectures and model accuracy, to enable better architecture design choices under differential privacy constraints.
Felix Morsbach
- | Reconstructing Training Data with Informed Adversaries (Poster)
SlidesLive Video » Given access to a machine learning model, can an adversary reconstruct the model’s training data? This work proposes a formal threat model to study this question, shows that reconstruction attacks are feasible in theory and in practice, and presents preliminary results assessing how different factors of standard machine learning pipelines affect the success of reconstruction. Finally, we empirically evaluate what levels of differential privacy suffice to prevent reconstruction attacks. |
Borja Balle · Giovanni Cherubin · Jamie Hayes
- | SSSE: Efficiently Erasing Samples from Trained Machine Learning Models (Poster)
The availability of large amounts of user-provided data has been key to the success of machine learning for many real-world tasks. Recently, an increasing awareness has emerged that users should be given more control over how their data is used. In particular, users should have the right to prohibit the use of their data for training machine learning systems, and to have it erased from already trained systems. While several sample erasure methods have been proposed, all of them have drawbacks which have prevented them from gaining widespread adoption. In this paper, we propose an efficient and effective algorithm, SSSE, for sample erasure that is applicable to a wide class of machine learning models. From a second-order analysis of the model's loss landscape we derive a closed-form update step of the model parameters that only requires access to the data to be erased, not to the original training set. Experiments on CelebFaces Attributes (CelebA) and CIFAR10 show that in certain cases SSSE can erase samples almost as well as the optimal, yet impractical, gold standard of training a new model from scratch with only the permitted data.
Alexandra Peste · Dan Alistarh · Christoph Lampert
- | Differentially Private Hamiltonian Monte Carlo (Poster)
We present DP-HMC, a variant of Hamiltonian Monte Carlo (HMC) that is differentially private (DP). We use the penalty algorithm of Yildirim and Ermis to make the acceptance test private, and add Gaussian noise to the gradients of the target distribution to make the HMC proposal private. Our main contribution is showing that DP-HMC has the correct invariant distribution, and is ergodic. We also compare DP-HMC with the existing penalty algorithm, as well as DP-SGLD and DP-SGNHT. |
Ossi Räisä · Antti Koskela · Antti Honkela
- | Zero Knowledge Arguments for Verifiable Sampling (Poster)
SlidesLive Video » In privacy-preserving machine learning, it is less obvious to verify correct behavior of participants because they are not supposed to reveal their inputs in cleartext to other participants. It is hence important to make federated machine learning robust against data poisoning and related attacks. While input data can be related to a distributed ledger (blockchain), a less studied input is formed by the random sampling parties perform. In this paper, we describe strategies based on zero knowledge proofs to allow parties to prove they perform sampling (and other computations) correctly. We sketch a number of alternative ways to implement our idea and provide some preliminary experimental results. |
César Sabater · Jan Ramon
- | Basil: A Fast and Byzantine-Resilient Approach for Decentralized Training (Poster)
Decentralized (i.e., serverless) learning across a large number of distributed nodes (e.g., mobile users) has seen a surge of recent interests. The key advantage of these setups is that they provide privacy for the local data of the users while not requiring a server for coordinating the training. They can, however, suffer substantially from potential Byzantine nodes in the network who can degrade the training performance. Detection and mitigation of Byzantine behaviors in a decentralized learning setting is a daunting task, especially when the data distribution at the users is heterogeneous. As our main contribution, we propose \texttt{Basil}, a fast and computationally efficient Byzantine robust algorithm for decentralized training systems, which leverages a novel sequential, memory assisted and performance based criteria for training over a logical ring while filtering the Byzantine users. In the IID dataset distribution setting, we provide the theoretical convergence guarantees of \texttt{Basil}, demonstrating its linear convergence rate. Furthermore, for the IID setting, we experimentally demonstrate that \texttt{Basil} is robust to various Byzantine attacks, including the strong Hidden attack, while providing up to ${\sim}16 \%$ higher test accuracy over the state-of-the-art Byzantine-resilient decentralized learning approach. Additionally, we generalize \texttt{Basil} to the non-IID dataset distribution setting by proposing Anonymous Cyclic Data Sharing (ACDS), a technique that allows each node to anonymously share a random fraction of its local non-sensitive dataset (e.g., landmarks images) with all other nodes. We demonstrate that \texttt{Basil} alongside ACDS with only $5\%$ data sharing provides effective toleration of Byzantine nodes, unlike the state-of-the-art Byzantine robust algorithm that completely fails in the heterogeneous data setting.
Ahmed Elkordy · Saurav Prakash · Salman Avestimehr
- | Canonical Noise Distributions and Private Hypothesis Tests (Poster)
In the setting of $f$-DP, we propose the concept \emph{canonical noise distribution} (CND) which captures whether an additive privacy mechanism is tailored for a given $f$, and give a construction of a CND for an arbitrary tradeoff function $f$. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private $p$-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data. We apply our techniques to difference of proportions testing.
Jordan Awan · Salil Vadhan
- | Privacy-Aware Rejection Sampling (Poster)
Differential privacy (DP) offers strong protection against adversaries with arbitrary side-information and computational power. However, many implementations of DP mechanisms leave themselves vulnerable to side-channel attacks, such as timing attacks. Because many privacy mechanisms, such as the exponential mechanism, do not lend themselves to easy implementations, sampling methods such as MCMC or rejection sampling are used, and their runtime can leak privacy. In this work, we quantify the privacy cost due to the runtime of a rejection sampler in terms of $(\epsilon,\delta)$-DP. We also propose three modifications to the rejection sampling algorithm to protect against timing attacks by making the runtime independent of the data. Finally, we use our techniques to develop an adaptive rejection sampler for log-Hölder densities, which also has data-independent runtime.
Jordan Awan · Vinayak Rao
- | Reconstructing Test Labels from Noisy Loss Scores (Extended Abstract) (Poster)
SlidesLive Video » Label inference was recently introduced as the problem of reconstructing the ground truth labels of a private dataset from just the (possibly perturbed) cross-entropy loss scores evaluated at carefully crafted prediction vectors. In this paper, we generalize this result to provide necessary and sufficient conditions under which label inference is possible from a broad class of loss functions. We show that for many commonly used loss functions, including linearly decomposable losses, some Bregman divergence-based losses and when common activation functions are used, it is possible to design such attacks for arbitrary noise levels. We demonstrate that these attacks can also be carried out through a lightweight augmentation to any neural network model, enabling the adversary to make these attacks look benign. Our results call to attention these vulnerabilities which might be currently under silent exploitation. Armed with this information, individuals and organizations, which vend these seemingly innocuous aggregate metrics from their classification models, can grasp the potential scope of the resulting information leakage. |
Abhinav Aggarwal · Shiva Kasiviswanathan · Zekun Xu · Oluwaseyi Feyisetan · Nathanael Teissier
- | Understanding Training-Data Leakage from Gradients in Neural Networks for Image Classifications (Poster)
SlidesLive Video » Federated learning of deep learning models for supervised tasks, e.g. image classification and segmentation, has found many applications: for example in human-in-the-loop tasks such as film post-production where it enables sharing of domain expertise of human artists in an efficient and effective fashion. In many such applications, we need to protect the training data from being leaked when gradients are shared in the training process due to IP or privacy concerns. Recent works have demonstrated that it is possible to reconstruct the training data from gradients for an image-classification model when its architecture is known. However, there is still an incomplete theoretical understanding of the efficacy and failure of such attacks. In this paper, we analyse the source of training-data leakage from gradients. We formulate the problem of training data reconstruction as solving an optimisation problem iteratively for each layer. The layer-wise objective function is primarily defined by weights and gradients from the current layer as well as the output from the reconstruction of the subsequent layer, but it might also involve a ‘pull-back’ constraint from the preceding layer. Training data can be reconstructed when we solve the problem backward from the output of the network through each layer. Based on this formulation, we are able to attribute the potential leakage of the training data in a deep network to its architecture. We also propose a metric to measure the level of security of a deep learning model against gradient-based attacks on the training data. |
Cangxiong Chen · Neill Campbell
- | Sample-and-threshold differential privacy: Histograms and applications (Poster)
Federated analytics aims to compute accurate statistics from distributed datasets. A "Differential Privacy" (DP) guarantee is usually desired by the users of the devices storing the data. In this work, we prove a strong $(\epsilon, \delta)$-DP guarantee for a highly practical sampling-based procedure to derive histograms. We also provide accuracy guarantees and show how to apply the procedure to estimate quantiles and modes.
Graham Cormode
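The procedure sketched in the abstract can be illustrated in a few lines: each client's item is reported with some sampling probability, and only sampled counts reaching a threshold are released. The sampling rate and threshold below are arbitrary illustrative values; the $(\epsilon, \delta)$ they actually achieve is exactly what the paper analyzes and is not computed here.

import numpy as np
from collections import Counter

rng = np.random.default_rng(10)

def sample_and_threshold_histogram(client_items, sample_rate=0.1, threshold=5):
    """Each client's item is included with probability sample_rate; the server
    then releases only the sampled counts that reach threshold.  No explicit
    noise is added: the guarantee comes from sampling plus thresholding."""
    sampled = [item for item in client_items if rng.random() < sample_rate]
    counts = Counter(sampled)
    released = {item: c for item, c in counts.items() if c >= threshold}
    # rescale to estimate the true counts of the released items
    return {item: c / sample_rate for item, c in released.items()}

clients = rng.choice(["a", "b", "c", "d", "rare"], size=10000,
                     p=[0.4, 0.3, 0.2, 0.09, 0.01])
print(sample_and_threshold_histogram(clients.tolist()))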
- | Tight Accounting in the Shuffle Model of Differential Privacy (Poster)
Shuffle model of differential privacy is a novel distributed privacy model based on a combination of local privacy mechanisms and a trusted shuffler. It has been shown that the additional randomisation provided by the shuffler improves privacy bounds compared to the purely local mechanisms. Accounting tight bounds, especially for multi-message protocols, is complicated by the complexity brought by the shuffler. The recently proposed Fourier Accountant for evaluating $(\varepsilon,\delta)$-differential privacy guarantees has been shown to give tighter bounds than commonly used methods for non-adaptive compositions of various complex mechanisms. In this paper we show how to compute tight privacy bounds using the Fourier Accountant for multi-message versions of several ubiquitous mechanisms in the shuffle model and demonstrate looseness of the existing bounds in the literature.
Antti Koskela · Mikko Heikkilä · Antti Honkela
Author Information
Yu-Xiang Wang (UC Santa Barbara)
Borja Balle (DeepMind)
Giovanni Cherubin (Alan Turing Institute)
Kamalika Chaudhuri (UCSD)
Antti Honkela (University of Helsinki)
Jonathan Lebensold (McGill University)
Casey Meehan (University of California, San Diego)
Mi Jung Park (University of British Columbia)
Adrian Weller (Cambridge, Alan Turing Institute)
Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, where he is also a Turing Fellow leading work on safe and ethical AI. He is a Principal Research Fellow in Machine Learning at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. He serves on several boards including the Centre for Data Ethics and Innovation. Previously, Adrian held senior roles in finance.
Yuqing Zhu (University of California Santa Barbara)
Yao-Yuan Yang · Cyrus Rashtchian · Hongyang Zhang · Russ Salakhutdinov · Kamalika Chaudhuri -
2020 Poster: Privacy Amplification via Random Check-Ins »
Borja Balle · Peter Kairouz · Brendan McMahan · Om Thakkar · Abhradeep Guha Thakurta -
2020 Poster: Improving Sparse Vector Technique with Renyi Differential Privacy »
Yuqing Zhu · Yu-Xiang Wang -
2019 : Audrey Durand, Douwe Kiela, Kamalika Chaudhuri moderated by Yann Dauphin »
Audrey Durand · Kamalika Chaudhuri · Yann Dauphin · Orhan Firat · Dilan Gorur · Douwe Kiela -
2019 : Kamalika Chaudhuri - A Three Sample Test to Detect Data Copying in Generative Models »
Kamalika Chaudhuri -
2019 : Poster Session »
Clement Canonne · Kwang-Sung Jun · Seth Neel · Di Wang · Giuseppe Vietri · Liwei Song · Jonathan Lebensold · Huanyu Zhang · Lovedeep Gondara · Ang Li · FatemehSadat Mireshghallah · Jinshuo Dong · Anand D Sarwate · Antti Koskela · Joonas Jälkö · Matt Kusner · Dingfan Chen · Mi Jung Park · Ashwin Machanavajjhala · Jayashree Kalpathy-Cramer · · Vitaly Feldman · Andrew Tomkins · Hai Phan · Hossein Esfandiari · Mimansa Jaiswal · Mrinank Sharma · Jeff Druce · Casey Meehan · Zhengli Zhao · Hsiang Hsu · Davis Railsback · Abraham Flaxman · · Julius Adebayo · Aleksandra Korolova · Jiaming Xu · Naoise Holohan · Samyadeep Basu · Matthew Joseph · My Thai · Xiaoqian Yang · Ellen Vitercik · Michael Hutchinson · Chenghong Wang · Gregory Yauney · Yuchao Tao · Chao Jin · Si Kai Lee · Audra McMillan · Rauf Izmailov · Jiayi Guo · Siddharth Swaroop · Tribhuvanesh Orekondy · Hadi Esmaeilzadeh · Kevin Procopio · Alkis Polyzotis · Jafar Mohammadi · Nitin Agrawal -
2019 Workshop: Privacy in Machine Learning (PriML) »
Borja Balle · Kamalika Chaudhuri · Antti Honkela · Antti Koskela · Casey Meehan · Mi Jung Park · Mary Anne Smart · Adrian Weller -
2019 : Poster Session »
Jonathan Scarlett · Piotr Indyk · Ali Vakilian · Adrian Weller · Partha P Mitra · Benjamin Aubin · Bruno Loureiro · Florent Krzakala · Lenka Zdeborová · Kristina Monakhova · Joshua Yurtsever · Laura Waller · Hendrik Sommerhoff · Michael Moeller · Rushil Anirudh · Shuang Qiu · Xiaohan Wei · Zhuoran Yang · Jayaraman Thiagarajan · Salman Asif · Michael Gillhofer · Johannes Brandstetter · Sepp Hochreiter · Felix Petersen · Dhruv Patel · Assad Oberai · Akshay Kamath · Sushrut Karmalkar · Eric Price · Ali Ahmed · Zahra Kadkhodaie · Sreyas Mohan · Eero Simoncelli · Carlos Fernandez-Granda · Oscar Leong · Wesam Sakla · Rebecca Willett · Stephan Hoyer · Jascha Sohl-Dickstein · Sam Greydanus · Gauri Jagatap · Chinmay Hegde · Michael Kellman · Jonathan Tamir · Nouamane Laanait · Ousmane Dia · Mirco Ravanelli · Jonathan Binas · Negar Rostamzadeh · Shirin Jalali · Tiantian Fang · Alex Schwing · Sébastien Lachapelle · Philippe Brouillard · Tristan Deleu · Simon Lacoste-Julien · Stella Yu · Arya Mazumdar · Ankit Singh Rawat · Yue Zhao · Jianshu Chen · Xiaoyang Li · Hubert Ramsauer · Gabrio Rizzuti · Nikolaos Mitsakos · Dingzhou Cao · Thomas Strohmer · Yang Li · Pei Peng · Gregory Ongie -
2019 Workshop: Workshop on Human-Centric Machine Learning »
Plamen P Angelov · Nuria Oliver · Adrian Weller · Manuel Rodriguez · Isabel Valera · Silvia Chiappa · Hoda Heidari · Niki Kilbertus -
2019 Poster: Privacy Amplification by Mixing and Diffusion Mechanisms »
Borja Balle · Gilles Barthe · Marco Gaboardi · Joseph Geumlek -
2019 Poster: Differentially Private Markov Chain Monte Carlo »
Mikko Heikkilä · Joonas Jälkö · Onur Dikmen · Antti Honkela -
2019 Spotlight: Differentially Private Markov Chain Monte Carlo »
Mikko Heikkilä · Joonas Jälkö · Onur Dikmen · Antti Honkela -
2019 Poster: The Label Complexity of Active Learning from Observational Data »
Songbai Yan · Kamalika Chaudhuri · Tara Javidi -
2019 Poster: Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models »
Yunfei Teng · Wenbo Gao · François Chalus · Anna Choromanska · Donald Goldfarb · Adrian Weller -
2019 Poster: Capacity Bounded Differential Privacy »
Kamalika Chaudhuri · Jacob Imola · Ashwin Machanavajjhala -
2018 : Invited talk 3: Challenges in the Privacy-Preserving Analysis of Structured Data »
Kamalika Chaudhuri -
2018 : Plenary Talk 2 »
Kamalika Chaudhuri -
2018 Workshop: Machine Learning Open Source Software 2018: Sustainable communities »
Heiko Strathmann · Viktor Gal · Ryan Curtin · Antti Honkela · Sergey Lisitsyn · Cheng Soon Ong -
2018 Workshop: Privacy Preserving Machine Learning »
Adria Gascon · Aurélien Bellet · Niki Kilbertus · Olga Ohrimenko · Mariana Raykova · Adrian Weller -
2018 Workshop: Workshop on Security in Machine Learning »
Nicolas Papernot · Jacob Steinhardt · Matt Fredrikson · Kamalika Chaudhuri · Florian Tramer -
2018 Poster: Geometrically Coupled Monte Carlo Sampling »
Mark Rowland · Krzysztof Choromanski · François Chalus · Aldo Pacchiano · Tamas Sarlos · Richard Turner · Adrian Weller -
2018 Spotlight: Geometrically Coupled Monte Carlo Sampling »
Mark Rowland · Krzysztof Choromanski · François Chalus · Aldo Pacchiano · Tamas Sarlos · Richard Turner · Adrian Weller -
2018 Poster: Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences »
Borja Balle · Gilles Barthe · Marco Gaboardi -
2017 : Invited talk: Differential privacy and Bayesian learning »
Antti Honkela -
2017 : Poster Session (encompasses coffee break) »
Beidi Chen · Borja Balle · Daniel Lee · iuri frosio · Jitendra Malik · Jan Kautz · Ke Li · Masashi Sugiyama · Miguel A. Carreira-Perpinan · Ramin Raziperchikolaei · Theja Tulabandhula · Yung-Kyun Noh · Adams Wei Yu -
2017 : Invited talk: Challenges for Transparency »
Adrian Weller -
2017 : Analyzing Robustness of Nearest Neighbors to Adversarial Examples »
Kamalika Chaudhuri -
2017 : Closing remarks »
Adrian Weller -
2017 Symposium: Kinds of intelligence: types, tests and meeting the needs of society »
José Hernández-Orallo · Zoubin Ghahramani · Tomaso Poggio · Adrian Weller · Matthew Crosby -
2017 Poster: From Parity to Preference-based Notions of Fairness in Classification »
Muhammad Bilal Zafar · Isabel Valera · Manuel Rodriguez · Krishna Gummadi · Adrian Weller -
2017 Poster: Renyi Differential Privacy Mechanisms for Posterior Sampling »
Joseph Geumlek · Shuang Song · Kamalika Chaudhuri -
2017 Poster: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Spotlight: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Poster: The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings »
Krzysztof Choromanski · Mark Rowland · Adrian Weller -
2017 Poster: Uprooting and Rerooting Higher-Order Graphical Models »
Mark Rowland · Adrian Weller -
2017 Poster: Hierarchical Methods of Moments »
Matteo Ruffini · Guillaume Rabusseau · Borja Balle -
2017 Poster: Multitask Spectral Learning of Weighted Automata »
Guillaume Rabusseau · Borja Balle · Joelle Pineau -
2017 Poster: Differentially private Bayesian learning on distributed data »
Mikko Heikkilä · Eemil Lagerspetz · Samuel Kaski · Kana Shimizu · Sasu Tarkoma · Antti Honkela -
2017 Tutorial: Differentially Private Machine Learning: Theory, Algorithms and Applications »
Kamalika Chaudhuri · Anand D Sarwate -
2016 Workshop: Private Multi-Party Machine Learning »
Borja Balle · Aurélien Bellet · David Evans · Adrià Gascón -
2016 Workshop: Reliable Machine Learning in the Wild »
Dylan Hadfield-Menell · Adrian Weller · David Duvenaud · Jacob Steinhardt · Percy Liang -
2016 Symposium: Machine Learning and the Law »
Adrian Weller · Thomas D. Grant · Conrad McDonnell · Jatinder Singh -
2016 Poster: Active Learning from Imperfect Labelers »
Songbai Yan · Kamalika Chaudhuri · Tara Javidi -
2015 : Genome-wide modelling of transcription kinetics reveals patterns of RNA production delays »
Antti Honkela -
2015 : Kamalika Chaudhuri »
Kamalika Chaudhuri -
2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice »
Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh -
2015 Symposium: Algorithms Among Us: the Societal Impacts of Machine Learning »
Michael A Osborne · Adrian Weller · Murray Shanahan -
2015 Poster: Active Learning from Weak and Strong Labelers »
Chicheng Zhang · Kamalika Chaudhuri -
2015 Poster: Spectral Learning of Large Structured HMMs for Comparative Epigenomics »
Chicheng Zhang · Jimin Song · Kamalika Chaudhuri · Kevin Chen -
2015 Poster: Convergence Rates of Active Learning for Maximum Likelihood Estimation »
Kamalika Chaudhuri · Sham Kakade · Praneeth Netrapalli · Sujay Sanghavi -
2014 Poster: Clamping Variables and Approximate Inference »
Adrian Weller · Tony Jebara -
2014 Oral: Clamping Variables and Approximate Inference »
Adrian Weller · Tony Jebara -
2014 Poster: Beyond Disagreement-Based Agnostic Active Learning »
Chicheng Zhang · Kamalika Chaudhuri -
2014 Poster: Rates of Convergence for Nearest Neighbor Classification »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2014 Spotlight: Beyond Disagreement-Based Agnostic Active Learning »
Chicheng Zhang · Kamalika Chaudhuri -
2014 Spotlight: Rates of Convergence for Nearest Neighbor Classification »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song -
2013 Workshop: Machine Learning Open Source Software: Towards Open Workflows »
Antti Honkela · Cheng Soon Ong -
2013 Poster: A Stability-based Validation Procedure for Differentially Private Machine Learning »
Kamalika Chaudhuri · Staal A Vinterbo -
2012 Poster: Near-optimal Differentially Private Principal Components »
Kamalika Chaudhuri · Anand D Sarwate · Kaushik Sinha -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2010 Poster: Rates of convergence for the cluster tree »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2009 Poster: A Parameter-free Hedging Algorithm »
Kamalika Chaudhuri · Yoav Freund · Daniel Hsu -
2008 Poster: Privacy-preserving logistic regression »
Kamalika Chaudhuri · Claire Monteleoni