Poster
Near Neighbor: Who is the Fairest of Them All?
Sariel Har-Peled · Sepideh Mahabadi
East Exhibition Hall B, C #35
Keywords: [ Algorithms ] [ Fairness, Accountabil ] [ Algorithms -> Classification; Algorithms -> Clustering; Algorithms -> Density Estimation; Applications ]
[
Abstract
]
Abstract:
In this work we study a "fair" variant of the near neighbor problem. Namely, given a set of points and a parameter , the goal is to preprocess the points, such that given a query point , any point in the -neighborhood of the query, i.e., , have the same probability of being reported as the near neighbor.
We show that LSH based algorithms can be made fair, without a significant loss in efficiency. Specifically, we show an algorithm that reports a point in the -neighborhood of a query with almost uniform probability. The time to report such a point is proportional to , and its space is , where and are the query time and space of an LSH algorithm for -approximate near neighbor, and is a function of the local density around .
Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. Finally, we run experiments to show performance of our approach on real data.
Live content is unavailable. Log in and register to view live content