
Workshop: All Things Attention: Bridging Different Perspectives on Attention

Unlocking Slot Attention by Changing Optimal Transport Costs

Yan Zhang · David Zhang · Simon Lacoste-Julien · Gertjan Burghouts · Cees Snoek

Keywords: [ Attention ] [ object-centric ] [ slot attention ] [ multiset ] [ optimal transport ] [ Equivariance ]


Slot attention is a successful method for object-centric modeling with images and videos for tasks like unsupervised object discovery. However, its set-equivariance limits its ability to perform tiebreaking, which makes it difficult to distinguish similar structures, a task crucial for vision problems. To fix this, we cast cross-attention in slot attention as an optimal transport (OT) problem that has solutions with the desired tiebreaking properties. We then propose an entropy minimization module that combines the tiebreaking properties of unregularized OT with the speed of regularized OT. We evaluate our method on CLEVR object detection and observe a significant improvement from 53% to 91% on a strict average precision metric.
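
The tiebreaking limitation can be illustrated with a small sketch. The code below is not the authors' method or module; it is a plain Sinkhorn solver for entropy-regularized OT, written under assumed choices (uniform marginals, a hand-made cost matrix, and the hypothetical names `sinkhorn`, `eps`, and `n_iters`). It shows that when two slots have identical costs to every input, the regularized transport plan assigns them identical rows, so the tie is not broken.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT with uniform marginals (plain Sinkhorn iterations).

    cost: (n_slots, n_inputs) array; eps: regularization strength (assumed value).
    Returns a transport plan of the same shape as `cost`.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # mass assigned to each slot (assumed uniform)
    b = np.full(m, 1.0 / m)      # mass assigned to each input (assumed uniform)
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows to match the slot marginal
        v = b / (K.T @ u)        # rescale columns to match the input marginal
    return u[:, None] * K * v[None, :]

# Two slots whose costs to the four inputs are exactly tied (illustrative data).
cost = np.array([[0.0, 1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0, 1.0]])
plan = sinkhorn(cost)

# Both slots receive the same share of every input: the tie is preserved.
print(np.allclose(plan[0], plan[1]))
```

With tied costs, the symmetric plan is the only output this solver can produce, whereas unregularized OT also admits solutions that send different inputs to different slots. This is the gap between regularized and unregularized OT that the abstract refers to.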
