Poster

Unsupervised Feature Selection for the $k$-means Clustering Problem

Christos Boutsidis · Michael W Mahoney · Petros Drineas


Abstract: We present a novel feature selection algorithm for the $k$-means clustering problem. Our algorithm is randomized and, given an accuracy parameter $\epsilon \in (0,1)$, selects and appropriately rescales, in an unsupervised manner, $\Theta(k \log(k/\epsilon)/\epsilon^2)$ features from a dataset of arbitrary dimensions. We prove that, if we run any $\gamma$-approximate $k$-means algorithm ($\gamma \geq 1$) on the features selected using our method, we can find a $(1+(1+\epsilon)\gamma)$-approximate partition with high probability.
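
To get a feel for how the two bounds in the abstract scale, the following minimal sketch evaluates them numerically. It is only illustrative: the abstract gives the feature count up to a constant hidden by $\Theta(\cdot)$, so the constant `c = 1`, the use of the natural logarithm, and the function names `num_features` and `approx_factor` are assumptions made here, not values taken from the paper.

```python
import math


def num_features(k: int, eps: float, c: float = 1.0) -> int:
    """Evaluate c * k * log(k / eps) / eps^2, rounded up.

    The constant c and the natural-log base are assumptions; the
    abstract only states the Theta(k log(k/eps) / eps^2) growth rate.
    """
    if k < 1 or not (0.0 < eps < 1.0):
        raise ValueError("need k >= 1 and eps in (0, 1)")
    return math.ceil(c * k * math.log(k / eps) / eps ** 2)


def approx_factor(gamma: float, eps: float) -> float:
    """Approximation factor 1 + (1 + eps) * gamma from the abstract."""
    if gamma < 1.0 or not (0.0 < eps < 1.0):
        raise ValueError("need gamma >= 1 and eps in (0, 1)")
    return 1.0 + (1.0 + eps) * gamma


if __name__ == "__main__":
    # Tightening eps increases the number of selected features
    # and brings the guarantee closer to 1 + gamma.
    for eps in (0.5, 0.25, 0.1):
        print(eps, num_features(10, eps), approx_factor(1.0, eps))
```

Under these assumptions, `num_features(10, 0.5)` evaluates to 120 and `approx_factor(1.0, 0.5)` to 2.5. The feature count depends on $k$ and $\epsilon$ only, not on the original dimensionality of the data, which is the point of the "arbitrary dimensions" claim.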
