Video object segmentation (VOS) describes the task of segmenting a set of objects in each frame of a video. In the semi-supervised setting, the first mask of each object is provided at test time. Following the one-shot principle, fine-tuning VOS methods train a segmentation model separately on each given object mask. However, the VOS community has recently deemed such test time optimization and its impact on the test runtime unfeasible. To mitigate the inefficiencies of previous fine-tuning approaches, we present efficient One-Shot Video Object Segmentation (e-OSVOS). In contrast to most VOS approaches, e-OSVOS decouples the object detection task and predicts only local segmentation masks by applying a modified version of Mask R-CNN. The one-shot test runtime and performance are optimized without a laborious and handcrafted hyperparameter search. To this end, we meta learn the model initialization and learning rates for the test time optimization. To achieve an optimal learning behavior, we predict individual learning rates at a neuron level, i.e., a pair of learning rates for the weight tensor and scalar bias of each neuron. Furthermore, we address the common performance degradation over the course of a sequence with an online adaptation that continuously fine-tunes the model on previous mask predictions, supported by a frame-to-frame bounding box propagation. e-OSVOS provides state-of-the-art results on DAVIS 2016, DAVIS 2017 and YouTube-VOS for one-shot fine-tuning methods while reducing the test runtime substantially.
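The abstract describes one-shot test time fine-tuning with learning rates predicted at a neuron level. The snippet below is a minimal PyTorch sketch of that idea, not the authors' implementation: it assumes a generic segmentation model that maps a frame to mask logits, the function names and `init_lr` value are hypothetical, and the meta-learning outer loop that would optimize the initialization and the learning rates across training sequences is omitted.

```python
# Hedged sketch (not the authors' code): one-shot test-time fine-tuning with
# per-neuron learning rates. Model, shapes, and `init_lr` are assumptions.
import torch
import torch.nn.functional as F


def make_neuron_lrs(model, init_lr=1e-3):
    # One learning-rate scalar per output neuron of every parameter tensor,
    # i.e. a vector of length out_channels/out_features per weight and bias.
    # In the full method these values would be meta-learned; here they are
    # simply initialized to a constant.
    return {
        name: torch.full((p.shape[0],), init_lr, requires_grad=True)
        for name, p in model.named_parameters()
    }


def one_shot_finetune(model, neuron_lrs, first_frame, first_mask, steps=10):
    # Fine-tune only on the first frame's ground-truth mask (one-shot).
    for _ in range(steps):
        logits = model(first_frame)  # assumed to return [1, 1, H, W] mask logits
        loss = F.binary_cross_entropy_with_logits(logits, first_mask)
        grads = torch.autograd.grad(
            loss, list(model.parameters()), allow_unused=True
        )
        with torch.no_grad():
            for (name, p), g in zip(model.named_parameters(), grads):
                if g is None:
                    continue  # parameter did not contribute to this prediction
                # Broadcast the per-neuron learning rate over the remaining
                # parameter dimensions (e.g. [out, in, k, k] for conv weights).
                lr = neuron_lrs[name].view(-1, *([1] * (p.dim() - 1)))
                p -= lr * g
    return model
```

The same update could be reused for the online adaptation mentioned in the abstract by calling it on later frames with predicted masks instead of the ground-truth first mask.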
Author Information
Tim Meinhardt (TUM)
Laura Leal-Taixé (TUM)
More from the Same Authors
- 2021: STEP: Segmenting and Tracking Every Pixel
  Mark Weber · Jun Xie · Maxwell Collins · Yukun Zhu · Paul Voigtlaender · Hartwig Adam · Bradley Green · Andreas Geiger · Bastian Leibe · Daniel Cremers · Aljosa Osep · Laura Leal-Taixé · Liang-Chieh Chen
- 2021: DENETHOR: The DynamicEarthNET dataset for Harmonized, inter-Operable, analysis-Ready, daily crop monitoring from space
  Lukas Kondmann · Aysim Toker · Marc Rußwurm · Andrés Camero · Devis Peressuti · Grega Milcinski · Pierre-Philippe Mathieu · Nicolas Longepe · Timothy Davis · Giovanni Marchisio · Laura Leal-Taixé · Xiaoxiang Zhu
- 2022: PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?
  Aleksandr Kim · Guillem Braso · Aljosa Osep · Laura Leal-Taixé
- 2022 Poster: Learning to Discover and Detect Objects
  Vladimir Fomenko · Ismail Elezi · Deva Ramanan · Laura Leal-Taixé · Aljosa Osep
- 2022 Poster: Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?
  Patrick Dendorfer · Vladimir Yugay · Aljosa Osep · Laura Leal-Taixé
- 2022 Poster: The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes
  Peter Kocsis · Peter Súkeník · Guillem Braso · Matthias Niessner · Laura Leal-Taixé · Ismail Elezi
- 2020 Poster: Deep Shells: Unsupervised Shape Correspondence with Optimal Transport
  Marvin Eisenberger · Aysim Toker · Laura Leal-Taixé · Daniel Cremers