An Empirical Study on Clustering Pretrained Embeddings: Is Deep Strictly Better?
Tyler Scott · Ting Liu · Michael Mozer · Andrew Gallagher
Event URL: https://openreview.net/forum?id=TbEzwuIs__
Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods---including $k$-means and hierarchical agglomerative clustering---underperform supervised, deep, inductive methods. While the reported improvements are indeed impressive, experiments are mostly limited to face datasets, where the clustered embeddings are highly discriminative or well-separated by class (Recall@1 above 90% and often near ceiling), and the experimental methodology seemingly favors the deep methods. We conduct an empirical study of 14 clustering methods on two popular non-face datasets---Cars196 and Stanford Online Products---and obtain robust but contentious findings. Notably, deep methods are surprisingly fragile for embeddings with more uncertainty, where they underperform the shallow, heuristic-based methods. We believe our benchmarks broaden the scope of supervised clustering methods beyond the face domain and can serve as a foundation on which these methods could be improved.
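To make the shallow baselines named in the abstract concrete, here is a minimal sketch of clustering pretrained embeddings with $k$-means and hierarchical agglomerative clustering using scikit-learn. The random embeddings, the class count, and the NMI scoring are placeholder assumptions for illustration, not the paper's exact datasets, protocol, or metrics.

```python
# Sketch of the shallow, heuristic clustering baselines from the abstract:
# k-means and hierarchical agglomerative clustering (HAC) on embeddings.
# Embeddings and labels below are random stand-ins; in the study they would
# come from a metric-learning model on Cars196 or Stanford Online Products.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import normalized_mutual_info_score
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
num_classes = 98                                      # e.g., Cars196 has 98 test classes
embeddings = rng.normal(size=(2000, 128))             # placeholder pretrained embeddings
labels = rng.integers(0, num_classes, size=2000)      # placeholder ground-truth classes

# L2-normalize so Euclidean distance tracks cosine similarity,
# a common convention for metric-learning embeddings.
embeddings = normalize(embeddings)

# Shallow baseline 1: k-means with the ground-truth number of clusters.
kmeans_preds = KMeans(n_clusters=num_classes, n_init=10,
                      random_state=0).fit_predict(embeddings)

# Shallow baseline 2: HAC with average linkage on Euclidean distance.
hac_preds = AgglomerativeClustering(n_clusters=num_classes,
                                    linkage="average").fit_predict(embeddings)

# Score each clustering against ground truth (NMI is one common choice).
print("k-means NMI:", normalized_mutual_info_score(labels, kmeans_preds))
print("HAC NMI:    ", normalized_mutual_info_score(labels, hac_preds))
```

Both baselines are unsupervised at clustering time; the deep, inductive alternatives discussed in the paper instead learn to predict cluster structure from labeled data.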
Author Information
Tyler Scott (University of Colorado, Boulder)
Ting Liu (Google Research)
Michael Mozer (Google Research, Brain Team)
Andrew Gallagher (Google)
More from the Same Authors
- 2021 : Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning
  Nan Rosemary Ke · Aniket Didolkar · Sarthak Mittal · Anirudh Goyal · Guillaume Lajoie · Stefan Bauer · Danilo Jimenez Rezende · Yoshua Bengio · Chris Pal · Michael Mozer
- 2021 : Learning Neural Causal Models with Active Interventions
  Nino Scherrer · Olexa Bilaniuk · Yashas Annadani · Anirudh Goyal · Patrick Schwab · Bernhard Schölkopf · Michael Mozer · Yoshua Bengio · Stefan Bauer · Nan Rosemary Ke
- 2022 : Neural Network Online Training with Sensitivity to Multiscale Temporal Structure
  Matt Jones · Tyler Scott · Gamaleldin Elsayed · Mengye Ren · Katherine Hermann · David Mayo · Michael Mozer
- 2022 : Improving Zero-shot Generalization and Robustness of Multi-modal Models
  Yunhao Ge · Jie Ren · Ming-Hsuan Yang · Yuxiao Wang · Andrew Gallagher · Hartwig Adam · Laurent Itti · Balaji Lakshminarayanan · Jiaping Zhao
- 2022 Poster: SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos
  Gamaleldin Elsayed · Aravindh Mahendran · Sjoerd van Steenkiste · Klaus Greff · Michael Mozer · Thomas Kipf
- 2021 Poster: Improving Anytime Prediction with Parallel Cascaded Networks and a Temporal-Difference Loss
  Michael Iuzzolino · Michael Mozer · Samy Bengio
- 2021 Poster: Soft Calibration Objectives for Neural Networks
  Archit Karandikar · Nicholas Cain · Dustin Tran · Balaji Lakshminarayanan · Jonathon Shlens · Michael Mozer · Becca Roelofs
- 2021 Poster: Neural Production Systems
  Anirudh Goyal · Aniket Didolkar · Nan Rosemary Ke · Charles Blundell · Philippe Beaudoin · Nicolas Heess · Michael Mozer · Yoshua Bengio
- 2021 Poster: Discrete-Valued Neural Communication
  Dianbo Liu · Alex Lamb · Kenji Kawaguchi · Anirudh Goyal · Chen Sun · Michael Mozer · Yoshua Bengio
- 2020 Poster: Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness
  Long Zhao · Ting Liu · Xi Peng · Dimitris Metaxas
- 2018 Poster: Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
  Tyler Scott · Karl Ridgeway · Michael Mozer
- 2018 Spotlight: Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
  Tyler Scott · Karl Ridgeway · Michael Mozer