Timezone: »
Poster
Differentially Private Distributed Data Summarization under Covariate Shift
Kanthi Sarpatwar · Karthikeyan Shanmugam · Venkata Sitaramagiridharganesh Ganapavarapu · Ashish Jagmohan · Roman Vaculin
Thu Dec 12 10:45 AM  12:45 PM (PST) @ East Exhibition Hall B + C #89
We envision Artificial Intelligence marketplaces to be platforms where consumers, with very less data for a target task, can obtain a relevant model by accessing many private data sources with vast number of data samples. One of the key challenges is to construct a training dataset that matches a target task without compromising on privacy of the data sources. To this end, we consider the following distributed data summarizataion problem. Given K private source datasets denoted by $[D_i]_{i\in [K]}$ and a small target validation set $D_v$, which may involve a considerable covariate shift with respect to the sources, compute a summary dataset $D_s\subseteq \bigcup_{i\in [K]} D_i$ such that its statistical distance from the validation dataset $D_v$ is minimized. We use the popular Maximum Mean Discrepancy as the measure of statistical distance. The nonprivate problem has received considerable attention in prior art, for example in prototype selection (Kim et al., NIPS 2016). Our work is the first to obtain strong differential privacy guarantees while ensuring the quality guarantees of the nonprivate version. We study this problem in a Parsimonious Curator Privacy Model, where a trusted curator coordinates the summarization process while minimizing the amount of private information accessed. Our central result is a novel protocol that (a) ensures the curator does not access more than $O(K^{\frac{1}{3}}D_s + D_v)$ points (b) has formal privacy guarantees on the leakage of information between the data owners and (c) closely matches the best known nonprivate greedy algorithm. Our protocol uses two hash functions, one inspired by the RahimiRecht random features method and the second leverages state of the art differential privacy mechanisms. We introduce a novel ``noiseless'' differentially private auctioning protocol, which may be of independent interest. Apart from theoretical guarantees, we demonstrate the efficacy of our protocol using realworld datasets.
Author Information
Kanthi Sarpatwar (IBM T. J. Watson Research Center)
Karthikeyan Shanmugam (IBM Research, NY)
Venkata Sitaramagiridharganesh Ganapavarapu (IBM Research)
Ashish Jagmohan (IBM Research)
Roman Vaculin (IBM Research)
More from the Same Authors

2021 : Math Programming based Reinforcement Learning for MultiEchelon Inventory Management »
Pavithra Harsha · Ashish Jagmohan · Jayant Kalagnanam · Brian Quanz · Divya Singhvi 
2022 Poster: Is this the Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations »
Amit Dhurandhar · Karthikeyan Natesan Ramamurthy · Karthikeyan Shanmugam 
2021 Poster: CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions »
Isha Puri · Amit Dhurandhar · Tejaswini Pedapati · Karthikeyan Shanmugam · Dennis Wei · Kush Varshney 
2021 Poster: FiniteSample Analysis of OffPolicy TDLearning via Generalized Bellman Operators »
Zaiwei Chen · Siva Theja Maguluri · Sanjay Shakkottai · Karthikeyan Shanmugam 
2021 Poster: Scalable Intervention Target Estimation in Linear Models »
Burak Varici · Karthikeyan Shanmugam · Prasanna Sattigeri · Ali Tajer 
2020 Poster: Active Structure Learning of Causal DAGs via Directed Clique Trees »
Chandler Squires · Sara Magliacane · Kristjan Greenewald · Dmitriy Katz · Murat Kocaoglu · Karthikeyan Shanmugam 
2020 Poster: Causal Discovery from Soft Interventions with Unknown Targets: Characterization and Learning »
Amin Jaber · Murat Kocaoglu · Karthikeyan Shanmugam · Elias Bareinboim 
2020 Poster: Mix and Match: An Optimistic TreeSearch Approach for Learning Models from Mixture Distributions »
Matthew Faw · Rajat Sen · Karthikeyan Shanmugam · Constantine Caramanis · Sanjay Shakkottai 
2020 Poster: Learning Global Transparent Models consistent with Local Contrastive Explanations »
Tejaswini Pedapati · Avinash Balakrishnan · Karthikeyan Shanmugam · Amit Dhurandhar 
2020 Poster: FiniteSample Analysis of Contractive Stochastic Approximation Using Smooth Convex Envelopes »
Zaiwei Chen · Siva Theja Maguluri · Sanjay Shakkottai · Karthikeyan Shanmugam 
2019 Poster: Sample Efficient Active Learning of Causal Trees »
Kristjan Greenewald · Dmitriy Katz · Karthikeyan Shanmugam · Sara Magliacane · Murat Kocaoglu · Enric BoixAdsera · Guy Bresler 
2019 Poster: Characterization and Learning of Causal Graphs with Latent Variables from Soft Interventions »
Murat Kocaoglu · Amin Jaber · Karthikeyan Shanmugam · Elias Bareinboim 
2018 Poster: Improving Simple Models with Confidence Profiles »
Amit Dhurandhar · Karthikeyan Shanmugam · Ronny Luss · Peder A Olsen 
2018 Poster: Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives »
Amit Dhurandhar · PinYu Chen · Ronny Luss · ChunChen Tu · Paishun Ting · Karthikeyan Shanmugam · Payel Das 
2017 Poster: Experimental Design for Learning Causal Graphs with Latent Variables »
Murat Kocaoglu · Karthikeyan Shanmugam · Elias Bareinboim 
2017 Poster: ModelPowered Conditional Independence Test »
Rajat Sen · Ananda Theertha Suresh · Karthikeyan Shanmugam · Alex Dimakis · Sanjay Shakkottai 
2015 Poster: Learning Causal Graphs with Small Interventions »
Karthikeyan Shanmugam · Murat Kocaoglu · Alex Dimakis · Sriram Vishwanath 
2014 Poster: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alex Dimakis · Adam Klivans 
2014 Poster: On the Information Theoretic Limits of Learning Ising Models »
Rashish Tandon · Karthikeyan Shanmugam · Pradeep Ravikumar · Alex Dimakis 
2014 Oral: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alex Dimakis · Adam Klivans