This paper proposes a simple self-supervised approach for learning a representation for visual correspondence from raw video. We cast correspondence as prediction of links in a space-time graph constructed from video. In this graph, the nodes are patches sampled from each frame, and nodes adjacent in time can share a directed edge. We learn a representation in which pairwise similarity defines transition probability of a random walk, such that prediction of long-range correspondence is computed as a walk along the graph. We optimize the representation to place high probability along paths of similarity. Targets for learning are formed without supervision, by cycle-consistency: the objective is to maximize the likelihood of returning to the initial node when walking along a graph constructed from a palindrome of frames. Thus, a single path-level constraint implicitly supervises chains of intermediate comparisons. When used as a similarity metric without adaptation, the learned representation outperforms the self-supervised state-of-the-art on label propagation tasks involving objects, semantic parts, and pose. Moreover, we demonstrate that a technique we call edge dropout, as well as self-supervised adaptation at test-time, further improve transfer for object-centric correspondence.
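The walk described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the softmax temperature of 0.07, and the assumption that each frame has already been encoded into L2-normalized patch embeddings are all choices made for this example. It builds row-stochastic transition matrices from pairwise similarity, chains them along a palindrome of frames, and scores the probability of each node returning to itself.

```python
import numpy as np


def softmax_rows(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def transition(a, b, temperature=0.07):
    """Transition probabilities from nodes in one frame to nodes in the next.

    a, b: (N, D) arrays of patch embeddings (assumed L2-normalized).
    The temperature value is an assumption for this sketch.
    """
    return softmax_rows(a @ b.T / temperature)


def cycle_walk_loss(emb, temperature=0.07):
    """Cycle-consistency loss on a palindrome of frames.

    emb: (T, N, D) array, N patch embeddings for each of T frames.
    Returns the mean negative log-probability of returning to the
    starting node, together with the (N, N) round-trip walk matrix.
    """
    num_frames, num_nodes = emb.shape[0], emb.shape[1]
    # Palindrome of frame indices: 0, 1, ..., T-1, T-2, ..., 0.
    order = list(range(num_frames)) + list(range(num_frames - 2, -1, -1))
    walk = np.eye(num_nodes)
    for src, dst in zip(order[:-1], order[1:]):
        # Chaining one-step transitions gives the long-range walk.
        walk = walk @ transition(emb[src], emb[dst], temperature)
    # The target for each node is itself: maximize the diagonal.
    return_prob = np.clip(np.diag(walk), 1e-12, 1.0)
    return -np.mean(np.log(return_prob)), walk
```

Because only the diagonal of the round-trip matrix is scored, the intermediate transitions receive no direct targets; they are shaped only through the path-level constraint, which is the sense in which the abstract says a single constraint supervises chains of intermediate comparisons.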
Author Information
Allan Jabri (UC Berkeley)
Andrew Owens (UC Berkeley)
Alexei Efros (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2020: FlowDB: A new large scale river flow, flash flood, and precipitation dataset
  Dates n/a. Room None
- 2020 Poster: Space-Time Correspondence as a Contrastive Random Walk
  Wed Dec 9th 05:00 -- 07:00 AM, Room Poster Session 2
More from the Same Authors
- 2020 Poster: Swapping Autoencoder for Deep Image Manipulation
  Taesung Park · Jun-Yan Zhu · Oliver Wang · Jingwan Lu · Eli Shechtman · Alexei Efros · Richard Zhang
- 2019 Poster: Unsupervised Curricula for Visual Meta-Reinforcement Learning
  Allan Jabri · Kyle Hsu · Abhishek Gupta · Ben Eysenbach · Sergey Levine · Chelsea Finn
- 2019 Spotlight: Unsupervised Curricula for Visual Meta-Reinforcement Learning
  Allan Jabri · Kyle Hsu · Abhishek Gupta · Ben Eysenbach · Sergey Levine · Chelsea Finn
- 2019 Poster: Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
  Deepak Pathak · Christopher Lu · Trevor Darrell · Phillip Isola · Alexei Efros
- 2019 Spotlight: Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
  Deepak Pathak · Christopher Lu · Trevor Darrell · Phillip Isola · Alexei Efros
- 2017 Poster: Toward Multimodal Image-to-Image Translation
  Jun-Yan Zhu · Richard Zhang · Deepak Pathak · Trevor Darrell · Alexei Efros · Oliver Wang · Eli Shechtman