Timezone: »
Poster
Distributed k-means and k-median clustering on general communication topologies
Maria-Florina F Balcan · Steven Ehrlich · Yingyu Liang
Sat Dec 07 07:00 PM -- 11:59 PM (PST) @ Harrah's Special Events Center, 2nd Floor
This paper provides new algorithms for distributed clustering for two popular center-based objectives, $k$-median and $k$-means. These algorithms have provable guarantees and improve communication complexity over existing approaches. Following a classic approach in clustering by \cite{har2004coresets}, we reduce the problem of finding a clustering with low cost to the problem of finding a `coreset' of small size. We provide a distributed method for constructing a global coreset which improves over the previous methods by reducing the communication complexity, and which works over general communication topologies. We provide experimental evidence for this approach on both synthetic and real data sets.
Author Information
Maria-Florina F Balcan (Georgia Tech)
Steven Ehrlich (Georgia Tech)
Yingyu Liang (Princeton University)
More from the Same Authors
-
2021 Poster: Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles »
Jiefeng Chen · Frederick Liu · Besim Avci · Xi Wu · Yingyu Liang · Somesh Jha -
2016 Poster: Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates »
Yuanzhi Li · Yingyu Liang · Andrej Risteski -
2015 Poster: Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients »
Bo Xie · Yingyu Liang · Le Song -
2014 Poster: Improved Distributed Principal Component Analysis »
Yingyu Liang · Maria-Florina F Balcan · Vandana Kanchanapally · David Woodruff -
2014 Poster: Active Learning and Best-Response Dynamics »
Maria-Florina F Balcan · Christopher Berlind · Avrim Blum · Emma Cohen · Kaushik Patnaik · Le Song -
2014 Poster: Learning Time-Varying Coverage Functions »
Nan Du · Yingyu Liang · Maria-Florina F Balcan · Le Song -
2014 Poster: Scalable Kernel Methods via Doubly Stochastic Gradients »
Bo Dai · Bo Xie · Niao He · Yingyu Liang · Anant Raj · Maria-Florina F Balcan · Le Song -
2013 Poster: Statistical Active Learning Algorithms »
Maria-Florina F Balcan · Vitaly Feldman -
2008 Workshop: New Challanges in Theoretical Machine Learning: Data Dependent Concept Spaces »
Maria-Florina F Balcan · Shai Ben-David · Avrim Blum · Kristiaan Pelckmans · John Shawe-Taylor