Poster
On Generalization Bounds for Projective Clustering
Maria Sofia Bucarelli · Matilde Larsen · Chris Schwiegelshohn · Mads Toftrup
Great Hall & Hall B1+B2 (level 1) #1821
Abstract:
Given a set of points, clustering consists of finding a partition of a point set into clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the famous -median and -means objectives. One may also choose centers to be dimensional subspaces, which gives rise to subspace clustering. In this paper, we consider learning bounds for these problems. That is, given a set of samples drawn independently from some unknown, but fixed distribution , how quickly does a solution computed on converge to the optimal clustering of ?We give several near optimal results. In particular, 1. For center-based objectives, we show a convergence rate of . This matches the known optimal bounds of [Fefferman, Mitter, and Narayanan, Journal of the Mathematical Society 2016] and [Bartlett, Linder, and Lugosi, IEEE Trans. Inf. Theory 1998] for -means and extends it to other important objectives such as -median. 2. For subspace clustering with -dimensional subspaces, we show a convergence rate of . These are the first provable bounds for most of these problems. For the specific case of projective clustering, which generalizes -means, we show a converge rate of is necessary, thereby proving that the bounds from [Fefferman, Mitter, and Narayanan, Journal of the Mathematical Society 2016] are essentially optimal.
Chat is not available.