Timezone: »
Poster
The FlajoletMartin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
Adam Smith · Shuang Song · Abhradeep Guha Thakurta
We revisit the problem of counting the number of distinct elements $\dist$ in a data stream $D$, over a domain $[u]$. We propose an $(\epsilon,\delta)$differentially private algorithm that approximates $\dist$ within a factor of $(1\pm\gamma)$, and with additive error of $O(\sqrt{\ln(1/\delta)}/\epsilon)$, using space $O(\ln(\ln(u)/\gamma)/\gamma^2)$. We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of $O(\sqrt{\ln(1/\delta)})$, and the space bound is optimal up to a factor of $O\left(\min\left\{\ln\left(\frac{\ln(u)}{\gamma}\right), \frac{1}{\gamma^2}\right\}\right)$. We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIMCDP.
Our algorithm is built on top of the celebrated FlajoletMartin (FM) sketch. We show that FMsketch is differentially private as is, as long as there are $\approx \sqrt{\ln(1/\delta)}/(\epsilon\gamma)$ distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of $k$ i.i.d. random variables is statistically close (in the sense of $\epsilon$differential privacy) to the maximum of $(k+1)$ i.i.d. samples from the same distribution, as long as $k=\Omega\left(\frac{1}{\epsilon}\right)$.
Finally, experiments show that our algorithms introduces error within an order of magnitude of the nonprivate analogues for streams with thousands of distinct elements, even while providing strong privacy guarantee ($\eps\leq 1$).
Author Information
Adam Smith (Boston University)
Shuang Song (Google)
Abhradeep Guha Thakurta (Google Research  Brain Team and UC Santa Cruz)
More from the Same Authors

2020 : Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems »
Shuang Song 
2021 Spotlight: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta 
2022 Poster: Refined DimensionDependent Analysis for Private Convex Learning and Implications for FineTuning »
Xuechen Li · Daogao Liu · Tatsunori Hashimoto · Huseyin A. Inan · Janardhan Kulkarni · YinTat Lee · Abhradeep Guha Thakurta 
2022 Poster: Improved Differential Privacy for SGD via Optimal Private Linear Operators on Adaptive Streams »
Sergey Denisov · H. Brendan McMahan · John Rush · Adam Smith · Abhradeep Guha Thakurta 
2021 Poster: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta 
2021 Poster: A Separation Result Between Dataoblivious and Dataaware Poisoning Attacks »
Samuel Deng · Sanjam Garg · Somesh Jha · Saeed Mahloujifar · Mohammad Mahmoody · Abhradeep Guha Thakurta 
2020 Poster: Privacy Amplification via Random CheckIns »
Borja Balle · Peter Kairouz · Brendan McMahan · Om Thakkar · Abhradeep Guha Thakurta 
2018 Poster: The Limits of PostSelection Generalization »
Jonathan Ullman · Adam Smith · Kobbi Nissim · Uri Stemmer · Thomas Steinke 
2017 Poster: Renyi Differential Privacy Mechanisms for Posterior Sampling »
Joseph Geumlek · Shuang Song · Kamalika Chaudhuri 
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song