Timezone: »

Random Projections for $k$-means Clustering
Christos Boutsidis · Anastasios Zouzias · Petros Drineas

Tue Dec 07 12:00 AM -- 12:00 AM (PST) @ None #None
This paper discusses the topic of dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows in a matrix $A \in \RR^{n \times d}$) can be projected into $t = \Omega(k / \eps^2)$ dimensions, for any $\eps \in (0,1/3)$, in $O(n d \lceil \eps^{-2} k/ \log(d) \rceil )$ time, such that with constant probability the optimal $k$-partition of the point set is preserved within a factor of $2+\eps$. The projection is done by post-multiplying $A$ with a $d \times t$ random matrix $R$ having entries $+1/\sqrt{t}$ or $-1/\sqrt{t}$ with equal probability. A numerical implementation of our technique and experiments on a large face images dataset verify the speed and the accuracy of our theoretical results.

Author Information

Christos Boutsidis (IBM T.J. Watson Research)
Anastasios Zouzias (University of Toronto)
Petros Drineas (Rensselaer Polytechnic Institute)

More from the Same Authors