A large part of the current success of deep learning lies in the effectiveness of data -- more precisely, of labelled data. Yet labelling a dataset with human annotation continues to carry high costs, especially for videos. While recent methods in the image domain have made it possible to generate meaningful (pseudo-)labels for unlabelled datasets without supervision, this development is missing in the video domain, where learning feature representations is the current focus. In this work, we a) show that unsupervised labelling of a video dataset does not come for free from strong feature encoders and b) propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations, by leveraging the natural correspondence between the audio and visual modalities. An extensive analysis shows that the resulting clusters have high semantic overlap with ground-truth human labels. We further introduce the first benchmarking results on unsupervised labelling of common video datasets.
Author Information
Yuki Asano (University of Oxford)
Mandela Patrick (University of Oxford)
Christian Rupprecht (University of Oxford)
Andrea Vedaldi (University of Oxford / Facebook AI Research)
More from the Same Authors
- 2021 : PASS: An ImageNet replacement for self-supervised pretraining without humans
  Yuki Asano · Christian Rupprecht · Andrew Zisserman · Andrea Vedaldi
- 2021 : ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation
  Laurynas Karazija · Iro Laina · Christian Rupprecht
- 2022 : Self-Guided Diffusion Model
  TAO HU · David Zhang · Yuki Asano · Gertjan Burghouts · Cees Snoek
- 2023 Workshop: Causal Representation Learning
  Sara Magliacane · Atalanti Mastakouri · Yuki Asano · Claudia Shi · Cian Eastwood · Sébastien Lachapelle · Bernhard Schölkopf · Caroline Uhler
- 2022 Workshop: Self-Supervised Learning: Theory and Practice
  Ishan Misra · Pengtao Xie · Gul Varol · Yale Song · Yuki Asano · Xiaolong Wang · Pauline Luc
- 2022 Poster: Unsupervised Multi-Object Segmentation by Predicting Probable Motion Patterns
  Laurynas Karazija · Subhabrata Choudhury · Iro Laina · Christian Rupprecht · Andrea Vedaldi
- 2021 Poster: Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
  Mandela Patrick · Dylan Campbell · Yuki Asano · Ishan Misra · Florian Metze · Christoph Feichtenhofer · Andrea Vedaldi · João Henriques
- 2021 Oral: Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
  Mandela Patrick · Dylan Campbell · Yuki Asano · Ishan Misra · Florian Metze · Christoph Feichtenhofer · Andrea Vedaldi · João Henriques
- 2021 Poster: Unsupervised Part Discovery from Contrastive Reconstruction
  Subhabrata Choudhury · Iro Laina · Christian Rupprecht · Andrea Vedaldi
- 2021 Poster: Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models
  Hannah Rose Kirk · Yennie Jun · Filippo Volpin · Haider Iqbal · Elias Benussi · Frederic Dreyer · Aleksandar Shtedritski · Yuki Asano
- 2020 Poster: Continuous Surface Embeddings
  Natalia Neverova · David Novotny · Marc Szafraniec · Vasil Khalidov · Patrick Labatut · Andrea Vedaldi
- 2020 Poster: Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction
  David Novotny · Roman Shapovalov · Andrea Vedaldi
- 2020 Poster: 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data
  Benjamin Biggs · David Novotny · Sebastien Ehrhardt · Hanbyul Joo · Ben Graham · Andrea Vedaldi
- 2020 Spotlight: 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data
  Benjamin Biggs · David Novotny · Sebastien Ehrhardt · Hanbyul Joo · Ben Graham · Andrea Vedaldi
- 2019 Poster: Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels
  Natalia Neverova · David Novotny · Andrea Vedaldi