Composite Attention: A Framework for Combining Sequence Mixing Primitives
Jake Cunningham · Marc Deisenroth
Keywords: Efficient Architectures
Abstract
Hybrid attention architectures have shown promising success both in equipping self-attention with inductive bias for long-sequence modelling and in reducing the computational burden of transformers without sacrificing quality. This paper introduces Composite Attention, a theoretical framework for analyzing the combination of sequence mixing primitives in modern deep learning architectures. Utilizing the definition of sequence mixers as structured linear maps, we formalize the composition of sequence mixing primitives as either sequential or recurrent composition.
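The view of a sequence mixer as a structured linear map acting along the sequence dimension can be illustrated with a minimal sketch. The specific mixing matrices below (an attention-style row-normalized causal matrix and a banded convolution-style matrix) are illustrative assumptions, not the paper's definitions; the sketch only shows that applying two mixers one after the other (sequential composition) corresponds to a product of their mixing matrices.

```python
import numpy as np

# A sequence mixer viewed as a structured linear map: y = M x, where M is an
# L x L mixing matrix acting along the sequence dimension (illustrative only).
L = 6
x = np.random.randn(L, 1)

# Hypothetical structured mixers, chosen for illustration:
# attention-style mixer: causal (lower-triangular), row-normalized scores
scores = np.tril(np.random.rand(L, L))
M_attn = scores / scores.sum(axis=1, keepdims=True)

# convolution-style mixer: banded lower-triangular matrix (short causal filter)
M_conv = np.tril(np.triu(np.ones((L, L)), -2))

# Sequential composition: apply one mixer after the other; the composite map
# is the matrix product of the individual mixing matrices.
y_sequential = M_conv @ (M_attn @ x)
M_composite = M_conv @ M_attn
assert np.allclose(y_sequential, M_composite @ x)
```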