Poster
Nearly-Tight and Oblivious Algorithms for Explainable Clustering
Buddhima Gamlath · Xinrui Jia · Adam Polak · Ola Svensson
Keywords: [ Interpretability ] [ Optimization ] [ Clustering ]
Abstract:
We study the problem of explainable clustering in the setting first formalized by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a single dimension (feature), and each of the k leaves corresponds to a cluster. We give an algorithm that outputs an explainable clustering that loses at most a factor of O(log² k) compared to an optimal (not necessarily explainable) clustering for the k-medians objective, and a factor of O(k log² k) for the k-means objective. This improves over the previous best upper bounds of O(k) and O(k²), respectively, and nearly matches the previous Ω(log k) lower bound for k-medians and our new Ω(k) lower bound for k-means. The algorithm is remarkably simple. In particular, given an initial (not necessarily explainable) clustering in R^d, it is oblivious to the data points and runs in time O(dk log² k), independent of the number of data points n. Our upper and lower bounds also generalize to objectives given by higher ℓ_p-norms.
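To make the notion of an explainable clustering concrete, here is a minimal sketch (not the authors' algorithm) of the object the abstract describes: a decision tree whose internal nodes are single-dimension threshold cuts and whose leaves are the clusters. The names `Node` and `assign` are illustrative and do not come from the paper.

```python
# Illustrative sketch of an explainable k-clustering: a decision tree of
# axis-aligned threshold cuts whose leaves are the k clusters.
# (Names and structure are assumptions for illustration, not from the paper.)

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Internal node: split on coordinate `dim` at `threshold`.
    # Leaf node: `cluster` holds the cluster label.
    dim: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None   # points with x[dim] <= threshold
    right: Optional["Node"] = None  # points with x[dim] > threshold
    cluster: int = -1               # set only at leaves


def assign(tree: Node, x: List[float]) -> int:
    """Route a point down the threshold-cut tree to its cluster (leaf)."""
    node = tree
    while node.cluster == -1:
        node = node.left if x[node.dim] <= node.threshold else node.right
    return node.cluster


# Example: a 3-clustering of points in R^2 given by two threshold cuts.
tree = Node(dim=0, threshold=0.5,
            left=Node(cluster=0),
            right=Node(dim=1, threshold=1.0,
                       left=Node(cluster=1),
                       right=Node(cluster=2)))

print(assign(tree, [0.2, 3.0]))  # cluster 0
print(assign(tree, [1.0, 0.5]))  # cluster 1
print(assign(tree, [1.0, 2.0]))  # cluster 2
```

Because every split inspects a single feature against a threshold, each cluster is an axis-aligned box, which is what makes the clustering human-interpretable; the paper's contribution is bounding how much cost this restriction incurs relative to an unrestricted clustering.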