Processing math: 100%
Skip to yearly menu bar Skip to main content


Poster

The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space

Adam Smith · Shuang Song · Abhradeep Guha Thakurta

Poster Session 2 #648

Abstract: We revisit the problem of counting the number of distinct elements \dist in a data stream D, over a domain [u]. We propose an (ϵ,δ)-differentially private algorithm that approximates \dist within a factor of (1±γ), and with additive error of O(ln(1/δ)/ϵ), using space O(ln(ln(u)/γ)/γ2). We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of O(ln(1/δ)), and the space bound is optimal up to a factor of O(min{ln(ln(u)γ),1γ2}). We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIM-CDP. Our algorithm is built on top of the celebrated Flajolet-Martin (FM) sketch. We show that FM-sketch is differentially private as is, as long as there are ln(1/δ)/(ϵγ) distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of k i.i.d. random variables is statistically close (in the sense of ϵ-differential privacy) to the maximum of (k+1) i.i.d. samples from the same distribution, as long as k=Ω(1ϵ). Finally, experiments show that our algorithms introduces error within an order of magnitude of the non-private analogues for streams with thousands of distinct elements, even while providing strong privacy guarantee (\eps1).

Chat is not available.