Timezone: »
Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning. Slot-based neural networks have recently shown promise at discovering and representing objects in visual scenes in a self-supervised fashion. While they make use of permutation symmetry of objects to drive learning of abstractions, they largely ignore other spatial symmetries present in the visual world. In this work, we introduce a simple, yet effective, method for incorporating spatial symmetries in attentional slot-based methods. We incorporate equivariance to translation and scale into the attention and generation mechanism of Slot Attention solely via translating and scaling positional encodings. Both changes result in little computational overhead, are easy to implement, and can result in large gains in data efficiency and scene decomposition performance.
Author Information
Ondrej Biza (Northeastern University, Google Brain)
Sjoerd van Steenkiste (Google Research)
Mehdi S. M. Sajjadi (Google)
Gamaleldin Elsayed (Google Research, Brain Team)
Aravindh Mahendran (Google)
Thomas Kipf (Google Research)
More from the Same Authors
-
2021 : Exploring through Random Curiosity with General Value Functions »
Aditya Ramesh · Louis Kirsch · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2021 : Unsupervised Learning of Temporal Abstractions using Slot-based Transformers »
Anand Gopalakrishnan · Kazuki Irie · Jürgen Schmidhuber · Sjoerd van Steenkiste -
2021 : Unsupervised Learning of Temporal Abstractions using Slot-based Transformers »
Anand Gopalakrishnan · Kazuki Irie · Jürgen Schmidhuber · Sjoerd van Steenkiste -
2022 : Neural Network Online Training with Sensitivity to Multiscale Temporal Structure »
Matt Jones · Tyler Scott · Gamaleldin Elsayed · Mengye Ren · Katherine Hermann · David Mayo · Michael Mozer -
2022 : Test-time adaptation with slot-centric models »
Mihir Prabhudesai · Sujoy Paul · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Anirudh Goyal · Deepak Pathak · Katerina Fragkiadaki · Gaurav Aggarwal · Thomas Kipf -
2022 : Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images »
David Klee · Ondrej Biza · Robert Platt · Robin Walters -
2022 : Teacher-generated pseudo human spatial-attention labels boost contrastive learning models »
Yushi Yao · Chang Ye · Junfeng He · Gamaleldin Elsayed -
2022 : Test-time adaptation with slot-centric models »
Mihir Prabhudesai · Sujoy Paul · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Anirudh Goyal · Deepak Pathak · Katerina Fragkiadaki · Gaurav Aggarwal · Thomas Kipf -
2023 Poster: Equivariant Single View Pose Prediction Via Induced and Restriction Representations »
Owen Howell · David Klee · Ondrej Biza · Linfeng Zhao · Robin Walters -
2022 Workshop: Workshop on neuro Causal and Symbolic AI (nCSI) »
Matej Zečević · Devendra Dhami · Christina Winkler · Thomas Kipf · Robert Peharz · Petar Veličković -
2022 Poster: SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos »
Gamaleldin Elsayed · Aravindh Mahendran · Sjoerd van Steenkiste · Klaus Greff · Michael Mozer · Thomas Kipf -
2022 Poster: Exploring through Random Curiosity with General Value Functions »
Aditya Ramesh · Louis Kirsch · Sjoerd van Steenkiste · Jürgen Schmidhuber -
2022 Poster: Object Scene Representation Transformer »
Mehdi S. M. Sajjadi · Daniel Duckworth · Aravindh Mahendran · Sjoerd van Steenkiste · Filip Pavetic · Mario Lucic · Leonidas Guibas · Klaus Greff · Thomas Kipf -
2020 Workshop: Object Representations for Learning and Reasoning »
William Agnew · Rim Assouel · Michael Chang · Antonia Creswell · Eliza Kosoy · Aravind Rajeswaran · Sjoerd van Steenkiste -
2020 Poster: Object-Centric Learning with Slot Attention »
Francesco Locatello · Dirk Weissenborn · Thomas Unterthiner · Aravindh Mahendran · Georg Heigold · Jakob Uszkoreit · Alexey Dosovitskiy · Thomas Kipf -
2020 Spotlight: Object-Centric Learning with Slot Attention »
Francesco Locatello · Dirk Weissenborn · Thomas Unterthiner · Aravindh Mahendran · Georg Heigold · Jakob Uszkoreit · Alexey Dosovitskiy · Thomas Kipf -
2019 Poster: Are Disentangled Representations Helpful for Abstract Visual Reasoning? »
Sjoerd van Steenkiste · Francesco Locatello · Jürgen Schmidhuber · Olivier Bachem -
2019 Poster: Saccader: Improving Accuracy of Hard Attention Models for Vision »
Gamaleldin Elsayed · Simon Kornblith · Quoc V Le -
2018 : Panel »
Paroma Varma · Aditya Grover · Will Hamilton · Jessica Hamrick · Thomas Kipf · Marinka Zitnik -
2018 : Compositional Imitation Learning: Explaining and executing one task at a time »
Thomas Kipf -
2018 Poster: Large Margin Deep Networks for Classification »
Gamaleldin Elsayed · Dilip Krishnan · Hossein Mobahi · Kevin Regan · Samy Bengio -
2018 Poster: Adversarial Examples that Fool both Computer Vision and Time-Limited Humans »
Gamaleldin Elsayed · Shreya Shankar · Brian Cheung · Nicolas Papernot · Alexey Kurakin · Ian Goodfellow · Jascha Sohl-Dickstein -
2017 : Relational neural expectation maximization »
Sjoerd van Steenkiste -
2017 Poster: Neural Expectation Maximization »
Klaus Greff · Sjoerd van Steenkiste · Jürgen Schmidhuber