Timezone: »
Poster
The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
Adam Smith · Shuang Song · Abhradeep Guha Thakurta
We revisit the problem of counting the number of distinct elements $\dist$ in a data stream $D$, over a domain $[u]$. We propose an $(\epsilon,\delta)$-differentially private algorithm that approximates $\dist$ within a factor of $(1\pm\gamma)$, and with additive error of $O(\sqrt{\ln(1/\delta)}/\epsilon)$, using space $O(\ln(\ln(u)/\gamma)/\gamma^2)$. We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of $O(\sqrt{\ln(1/\delta)})$, and the space bound is optimal up to a factor of $O\left(\min\left\{\ln\left(\frac{\ln(u)}{\gamma}\right), \frac{1}{\gamma^2}\right\}\right)$. We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIM-CDP.
Our algorithm is built on top of the celebrated Flajolet-Martin (FM) sketch. We show that FM-sketch is differentially private as is, as long as there are $\approx \sqrt{\ln(1/\delta)}/(\epsilon\gamma)$ distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of $k$ i.i.d. random variables is statistically close (in the sense of $\epsilon$-differential privacy) to the maximum of $(k+1)$ i.i.d. samples from the same distribution, as long as $k=\Omega\left(\frac{1}{\epsilon}\right)$.
Finally, experiments show that our algorithms introduces error within an order of magnitude of the non-private analogues for streams with thousands of distinct elements, even while providing strong privacy guarantee ($\eps\leq 1$).
Author Information
Adam Smith (Boston University)
Shuang Song (Google)
Abhradeep Guha Thakurta (Google Research - Brain Team and UC Santa Cruz)
More from the Same Authors
-
2020 : Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems »
Shuang Song -
2021 Spotlight: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2022 Poster: When Does Differentially Private Learning Not Suffer in High Dimensions? »
Xuechen Li · Daogao Liu · Tatsunori Hashimoto · Huseyin A. Inan · Janardhan Kulkarni · Yin-Tat Lee · Abhradeep Guha Thakurta -
2022 Poster: Improved Differential Privacy for SGD via Optimal Private Linear Operators on Adaptive Streams »
Sergey Denisov · H. Brendan McMahan · John Rush · Adam Smith · Abhradeep Guha Thakurta -
2021 Poster: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2021 Poster: A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks »
Samuel Deng · Sanjam Garg · Somesh Jha · Saeed Mahloujifar · Mohammad Mahmoody · Abhradeep Guha Thakurta -
2020 Poster: Privacy Amplification via Random Check-Ins »
Borja Balle · Peter Kairouz · Brendan McMahan · Om Thakkar · Abhradeep Guha Thakurta -
2018 Poster: The Limits of Post-Selection Generalization »
Jonathan Ullman · Adam Smith · Kobbi Nissim · Uri Stemmer · Thomas Steinke -
2017 Poster: Renyi Differential Privacy Mechanisms for Posterior Sampling »
Joseph Geumlek · Shuang Song · Kamalika Chaudhuri -
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song