We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach. Given an input monocular RGB video, the frames are processed by a transformer network that fuses the observations into a volumetric feature grid representing the scene; this feature grid is then decoded into an implicit 3D scene representation. Key to our approach is the transformer architecture, which enables the network to learn to attend to the most relevant image frames for each 3D location in the scene, supervised only by the scene reconstruction task. Features are fused in a coarse-to-fine fashion, storing fine-level features only where needed, which reduces memory requirements and enables fusion at interactive rates. The feature grid is then decoded to a higher-resolution scene reconstruction, using an MLP-based surface occupancy prediction from interpolated coarse-to-fine 3D features. Our approach results in an accurate surface reconstruction, outperforming state-of-the-art multi-view stereo depth estimation methods, fully-convolutional 3D reconstruction approaches, and approaches using LSTM- or GRU-based recurrent networks for video sequence fusion.
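The idea of attending to the most relevant frames for each 3D location can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with a fixed query vector and no learned projections, and the names `softmax`, `fuse_frame_features`, `query`, and `frame_features` are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frame_features(query, frame_features):
    """Fuse per-frame feature vectors observed at one 3D location.

    Assumption: scaled dot-product attention with a given query vector;
    a real transformer would learn query/key/value projections.
    Returns the fused feature and the per-frame attention weights.
    """
    d = len(query)
    # Similarity of the query to each frame's feature, scaled by sqrt(d).
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(d)
        for feat in frame_features
    ]
    weights = softmax(scores)
    # Weighted sum of frame features: frames with higher weight contribute more.
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, frame_features))
        for i in range(d)
    ]
    return fused, weights


# Hypothetical toy data: three frames observing the same 3D location.
query = [1.0, 0.0]
frame_features = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
fused, weights = fuse_frame_features(query, frame_features)
print(weights, fused)
```

In this toy setup the frame whose feature aligns best with the query receives the largest weight, which mirrors, in a much simplified form, the frame selection behavior described in the abstract.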
Author Information
Aljaz Bozic (Technical University of Munich)
Pablo Palafox (Technical University of Munich)
Justus Thies (Max Planck Institute for Intelligent Systems)
Angela Dai (Technical University of Munich)
Matthias Niessner (Technical University of Munich)
More from the Same Authors
- 2022 Poster: PatchComplete: Learning Multi-Resolution Patch Priors for 3D Shape Completion on Unseen Categories
  Yuchen Rao · Yinyu Nie · Angela Dai
- 2022 Spotlight: Lightning Talks 6A-3
  Junyu Xie · Chengliang Zhong · Ali Ayub · Sravanti Addepalli · Harsh Rangwani · Jiapeng Tang · Yuchen Rao · Zhiying Jiang · Yuqi Wang · Xingzhe He · Gene Chou · Ilya Chugunov · Samyak Jain · Yuntao Chen · Weidi Xie · Sumukh K Aithal · Carter Fendley · Lev Markhasin · Yiqin Dai · Peixing You · Bastian Wandt · Yinyu Nie · Helge Rhodin · Felix Heide · Ji Xin · Angela Dai · Andrew Zisserman · Bi Wang · Xiaoxue Chen · Mayank Mishra · ZHAO-XIANG ZHANG · Venkatesh Babu R · Justus Thies · Ming Li · Hao Zhao · Venkatesh Babu R · Jimmy Lin · Fuchun Sun · Matthias Niessner · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang
- 2022 Spotlight: PatchComplete: Learning Multi-Resolution Patch Priors for 3D Shape Completion on Unseen Categories
  Yuchen Rao · Yinyu Nie · Angela Dai
- 2022 Spotlight: Neural Shape Deformation Priors
  Jiapeng Tang · Lev Markhasin · Bi Wang · Justus Thies · Matthias Niessner
- 2022 Spotlight: 3DILG: Irregular Latent Grids for 3D Generative Modeling
  Biao Zhang · Matthias Niessner · Peter Wonka
- 2022 Poster: Neural Shape Deformation Priors
  Jiapeng Tang · Lev Markhasin · Bi Wang · Justus Thies · Matthias Niessner
- 2022 Poster: 3DILG: Irregular Latent Grids for 3D Generative Modeling
  Biao Zhang · Matthias Niessner · Peter Wonka
- 2022 Poster: The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes
  Peter Kocsis · Peter Súkeník · Guillem Braso · Matthias Niessner · Laura Leal-Taixé · Ismail Elezi
- 2021 Poster: Panoptic 3D Scene Reconstruction From a Single RGB Image
  Manuel Dahnert · Ji Hou · Matthias Niessner · Angela Dai
- 2020 : Angela Dai - Self-supervised generation of 3D shapes and scenes
  Angela Dai
- 2020 Poster: Neural Non-Rigid Tracking
  Aljaz Bozic · Pablo Palafox · Michael Zollhöfer · Angela Dai · Justus Thies · Matthias Niessner