Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images
David Klee · Ondrej Biza · Robert Platt · Robin Walters
Event URL: https://openreview.net/forum?id=e9R-eWUwSUB
Reasoning about 3D objects based on 2D images is challenging due to variations in appearance caused by viewing the object from different orientations. Some tasks, such as object classification, are invariant to 3D rotations, while others, such as pose estimation, are equivariant. However, imposing equivariance as a model constraint is typically not possible with 2D image input because we do not have an a priori model of how the image changes under out-of-plane object rotations. The only $\mathrm{SO}(3)$-equivariant models that currently exist require point cloud or voxel input rather than 2D images. In this paper, we propose a novel architecture based on icosahedral group convolutions that reasons in $\mathrm{SO}(3)$ by learning a projection of the input image onto an icosahedron. The resulting model is approximately equivariant to rotation in $\mathrm{SO}(3)$. We apply this model to object pose estimation and shape classification tasks and find that it outperforms reasonable baselines.
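For readers unfamiliar with the icosahedral group mentioned in the abstract, the following is a minimal sketch that enumerates the 60 rotational symmetries of a regular icosahedron, a finite subgroup of $\mathrm{SO}(3)$. It is not the authors' implementation and does not reproduce their architecture; the function names (`icosahedral_rotations`, `axis_angle`, `matmul`, `key`) and the choice of generators are assumptions made purely for illustration. It assumes an icosahedron whose vertices are the cyclic permutations of $(0, \pm 1, \pm\varphi)$, with $\varphi$ the golden ratio.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def axis_angle(axis, angle):
    """Rotation matrix about `axis` by `angle` (Rodrigues' formula)."""
    n = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / n for c in axis)
    c, s = math.cos(angle), math.sin(angle)
    t = 1 - c
    return (
        (c + x * x * t, x * y * t - z * s, x * z * t + y * s),
        (y * x * t + z * s, c + y * y * t, y * z * t - x * s),
        (z * x * t - y * s, z * y * t + x * s, c + z * z * t),
    )


def key(m):
    """Rounded flat tuple used to detect matrices already seen."""
    return tuple(round(v, 6) for row in m for v in row)


def icosahedral_rotations():
    """Enumerate the rotation group by closing two generators under products."""
    # Illustrative generators: a 5-fold rotation about the vertex (0, 1, PHI)
    # and a 3-fold rotation about the axis (1, 1, 1).
    gens = [
        axis_angle((0.0, 1.0, PHI), 2 * math.pi / 5),
        axis_angle((1.0, 1.0, 1.0), 2 * math.pi / 3),
    ]
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    group = {key(identity): identity}
    frontier = [identity]
    while frontier:
        new = []
        for m in frontier:
            for g in gens:
                p = matmul(g, m)
                k = key(p)
                if k not in group:
                    group[k] = p
                    new.append(p)
        frontier = new
    return list(group.values())


if __name__ == "__main__":
    print(len(icosahedral_rotations()))  # 60
```

Because this group is finite, a function defined on it can be convolved by summing over all 60 elements, which is what makes a discrete group a convenient stand-in for the continuous rotation group.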
Author Information
David Klee (Northeastern University)
Ondrej Biza (Northeastern University, Google Brain)
Robert Platt (Northeastern University)
Robin Walters (Northeastern University)
More from the Same Authors
- 2022: A Noether's theorem for gradient flow: Continuous symmetries of the architecture and conserved quantities of gradient flow
  Bo Zhao · Iordan Ganev · Robin Walters · Rose Yu · Nima Dehmamy
- 2022: Charting Flat Minima Using the Conserved Quantities of Gradient Flow
  Bo Zhao · Iordan Ganev · Robin Walters · Rose Yu · Nima Dehmamy
- 2022: Understanding Optimization Challenges when Encoding to Geometric Structures
  Babak Esmaeili · Robin Walters · Heiko Zimmermann · Jan-Willem van de Meent
- 2022: Spatial Symmetry in Slot Attention
  Ondrej Biza · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Gamaleldin Elsayed · Aravindh Mahendran · Thomas Kipf
- 2022 Poster: Meta-Learning Dynamics Forecasting Using Task Inference
  Rui Wang · Robin Walters · Rose Yu
- 2022 Poster: Symmetry Teleportation for Accelerated Optimization
  Bo Zhao · Nima Dehmamy · Robin Walters · Rose Yu
- 2021 Poster: Automatic Symmetry Discovery with Lie Algebra Convolutional Network
  Nima Dehmamy · Robin Walters · Yanchen Liu · Dashun Wang · Rose Yu