Timezone: »
Clustering large data is a fundamental problem with a vast number of applications. Due to the increasing size of data, practitioners interested in clustering have turned to distributed computation methods. In this work, we consider the widely used k-center clustering problem and its variant used to handle noisy data, k-center with outliers. In the noise-free setting we demonstrate how a previously-proposed distributed method is actually an O(1)-approximation algorithm, which accurately explains its strong empirical performance. Additionally, in the noisy setting, we develop a novel distributed algorithm that is also an O(1)-approximation. These algorithms are highly parallel and lend themselves to virtually any distributed computing framework. We compare both empirically against the best known noisy sequential clustering methods and show that both distributed algorithms are consistently close to their sequential versions. The algorithms are all one can hope for in distributed settings: they are fast, memory efficient and they match their sequential counterparts.
Author Information
Gustavo Malkomes (Washington University in St. Louis)
Matt J Kusner (Washington University in St. Louis)
Wenlin Chen (Washington University in St. Louis)
Kilian Q Weinberger (Washington University in St. Louis)
Benjamin Moseley (Washington University in St Lo)
More from the Same Authors
-
2022 : On Multi-information source Constraint Active Search »
Gustavo Malkomes · Bolong Cheng · Santiago Miret -
2022 : Group SELFIES: A Robust Fragment-Based Molecular String Representation »
Austin Cheng · Andy Cai · Santiago Miret · Gustavo Malkomes · Mariano Phielipp · Alan Aspuru-Guzik -
2018 Poster: Efficient nonmyopic batch active search »
Shali Jiang · Gustavo Malkomes · Matthew Abbott · Benjamin Moseley · Roman Garnett -
2018 Spotlight: Efficient nonmyopic batch active search »
Shali Jiang · Gustavo Malkomes · Matthew Abbott · Benjamin Moseley · Roman Garnett -
2018 Poster: Automating Bayesian optimization with Bayesian optimization »
Gustavo Malkomes · Roman Garnett -
2016 Poster: Supervised Word Mover's Distance »
Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger -
2016 Oral: Supervised Word Mover's Distance »
Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger -
2016 Poster: Bayesian optimization for automated model selection »
Gustavo Malkomes · Charles Schaff · Roman Garnett -
2015 Poster: Bayesian Active Model Selection with an Application to Automated Audiometry »
Jacob Gardner · Gustavo Malkomes · Roman Garnett · Kilian Weinberger · Dennis Barbour · John Cunningham -
2014 Workshop: Representation and Learning Methods for Complex Outputs »
Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto -
2013 Workshop: Output Representation Learning »
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser -
2012 Poster: Non-linear Metric Learning »
Dor Kedem · Stephen Tyree · Kilian Q Weinberger · Fei Sha · Gert Lanckriet -
2011 Workshop: Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity »
Greg Shakhnarovich · Dhruv Batra · Brian Kulis · Kilian Q Weinberger -
2011 Poster: Co-Training for Domain Adaptation »
Minmin Chen · Kilian Q Weinberger · John Blitzer -
2010 Session: Oral Session 16 »
Kilian Q Weinberger -
2010 Poster: Large Margin Multi-Task Metric Learning »
Shibin Parameswaran · Kilian Q Weinberger -
2010 Poster: Decoding Ipsilateral Finger Movements from ECoG Signals in Humans »
Yuzong Liu · Mohit Sharma · Charles M Gaona · Jonathan D Breshears · jarod Roland · zachary V Freudenburg · Kilian Q Weinberger · Eric C Leuthardt -
2008 Poster: Large Margin Taxonomy Embedding for Document Categorization »
Kilian Q Weinberger · Olivier Chapelle -
2008 Spotlight: Large Margin Taxonomy Embedding for Document Categorization »
Kilian Q Weinberger · Olivier Chapelle -
2006 Workshop: Novel Applications of Dimensionality Reduction »
John Blitzer · Rajarshi Das · Irina Rish · Kilian Q Weinberger -
2006 Poster: Graph Regularization for Maximum Variance Unfolding with an Application to Sensor Localization »
Kilian Q Weinberger · Fei Sha · Qihui Zhu · Lawrence Saul