We present 360-MLC, a self-training method based on multi-view layout consistency for finetuning monocular room-layout models using only unlabeled 360-images. This is valuable in practical scenarios where a pre-trained model must be adapted to a new data domain without any ground truth annotations. Our simple yet effective assumption is that multiple layout estimations in the same scene must define a consistent geometry regardless of their camera positions. Based on this idea, we use a pre-trained model to project estimated layout boundaries from several camera views into 3D world coordinates. We then re-project them back to spherical coordinates and build a probability function, from which we sample the pseudo-labels for self-training. To handle low-confidence pseudo-labels, we use the variance of the re-projected boundaries as an uncertainty value that weights each pseudo-label in our loss function during training. In addition, since ground truth annotations are available neither during training nor during testing, we use the entropy of multiple layout estimations as a quantitative metric of the geometric consistency of the scene. This metric lets us evaluate any layout estimator for hyper-parameter tuning and model selection without ground truth annotations. Experimental results show that our solution performs favorably against state-of-the-art methods when self-training from three publicly available source datasets to a unique, newly labeled dataset consisting of multi-view images of the same scenes.
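The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the boundaries from all views have already been re-projected into one target view as an array of per-column positions, it uses the per-column median as a simple stand-in for sampling from the probability function, and it computes the entropy on a top-down 2D histogram of the projected boundary points. All function names, the `eps` and `extent` parameters, and the inverse-variance weighting are hypothetical choices made for illustration.

```python
import numpy as np


def pseudo_labels_from_reprojections(reproj):
    """Aggregate re-projected boundaries into a pseudo-label and an uncertainty.

    reproj: array of shape (n_views, n_cols); each row is a layout boundary
    (for example an elevation angle per image column) estimated in another
    view and re-projected into the target view. NaN marks missing columns.
    """
    reproj = np.asarray(reproj, dtype=float)
    label = np.nanmedian(reproj, axis=0)  # assumed stand-in for sampling
    sigma = np.nanstd(reproj, axis=0)     # spread across views = uncertainty
    return label, sigma


def weighted_l1_loss(pred, label, sigma, eps=1e-3):
    """L1 loss where columns with high variance contribute less (assumed form)."""
    pred = np.asarray(pred, dtype=float)
    weights = 1.0 / (np.asarray(sigma, dtype=float) ** 2 + eps)
    weights = weights / weights.sum()
    return float(np.sum(weights * np.abs(pred - label)))


def layout_entropy(points_xy, bins=32, extent=4.0):
    """Entropy of a top-down 2D histogram of boundary points from all views.

    Lower entropy means the views agree on where the walls are.
    points_xy: array of shape (n_points, 2) in world coordinates.
    """
    points_xy = np.asarray(points_xy, dtype=float)
    hist, _, _ = np.histogram2d(
        points_xy[:, 0],
        points_xy[:, 1],
        bins=bins,
        range=[[-extent, extent], [-extent, extent]],
    )
    total = hist.sum()
    if total == 0:
        return 0.0
    p = hist.ravel() / total
    p = p[p > 0]  # skip empty cells so log is defined
    return float(-np.sum(p * np.log(p)))
```

In this sketch, a column on which all views agree receives a small `sigma` and therefore a large weight in the loss, while a column with scattered re-projections is down-weighted. Comparing `layout_entropy` across checkpoints is one way such a label-free metric could be used for model selection.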
Author Information
Bolivar Solarte (National Tsing Hua University)
Chin-Hsuan Wu (National Tsing Hua University)
Yueh-Cheng Liu (National Taiwan University)
Yi-Hsuan Tsai (NEC Labs America)
Min Sun (Appier, Inc.)
More from the Same Authors
- 2022 : VOTING-BASED APPROACHES FOR DIFFERENTIALLY PRIVATE FEDERATED LEARNING
  Yuqing Zhu · Xiang Yu · Yi-Hsuan Tsai · Francesco Pittaluga · Masoud Faraki · Manmohan Chandraker · Yu-Xiang Wang
- 2023 Poster: Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection
  Cheng-Ju Ho · Chen-Hsuan Tai · Yen-Yu Lin · Ming-Hsuan Yang · Yi-Hsuan Tsai
- 2021 Poster: End-to-end Multi-modal Video Temporal Grounding
  Yi-Wen Chen · Yi-Hsuan Tsai · Ming-Hsuan Yang
- 2020 Poster: Mitigating Forgetting in Online Continual Learning via Instance-Aware Parameterization
  Hung-Jen Chen · An-Chieh Cheng · Da-Cheng Juan · Wei Wei · Min Sun
- 2019 : Poster Session 2
  Hanson Wang · Yujun Lin · Yixiao Duan · Aditya Paliwal · Ameer Haj-Ali · Ryan Marcus · Tom Hope · Qiumin Xu · Nham Le · Yuxiang Sun · Ross Cutler · Vikram Nathan · Min Sun