Neural Fields Meet Attention
Kalyan Cherukuri · Aarav Lala
Abstract
We establish a mathematical connection between neural field optimization and Transformer attention mechanisms. First, we prove that Transformer-based operators learning a neural field are equivariant to affine transformations (translations and positive scalings) when they use relative positional encodings and coordinate normalization, extending geometric deep learning to the meta-learning of continuous functions. Second, we show that linear attention exactly computes the negative gradient of the squared-error loss for sinusoidal neural fields, and that softmax attention converges, both theoretically and empirically, to the same identity at rate $O(\tau^{-2})$ as the temperature $\tau$ grows. These results reveal that attention mechanisms carry an implicit geometric encoding that is well suited to learning continuous functions.
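The gradient identity in the second claim can be checked numerically in a minimal setting: for a field that is linear in sinusoidal (Fourier) features, a single linear-attention readout whose values are the scaled residual errors reproduces the change in the field after one negative-gradient step on the squared-error loss. The sketch below illustrates this kind of identity rather than the paper's exact construction; the feature map `fourier_features`, the step size `eta`, and the number of frequencies `K` are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(x, K=4):
    """Sinusoidal (Fourier) features of 1-D coordinates, shape (len(x), 2K)."""
    k = np.arange(1, K + 1)
    return np.concatenate([np.sin(2 * np.pi * np.outer(x, k)),
                           np.cos(2 * np.pi * np.outer(x, k))], axis=1)

# Context set: coordinates and target values of the signal being fitted.
x_ctx = rng.uniform(0, 1, size=32)
y_ctx = np.sin(2 * np.pi * x_ctx) + 0.5 * np.cos(6 * np.pi * x_ctx)

# Current weights of a sinusoidal field f_w(x) = w . phi(x), linear in the features.
K = 4
w = rng.normal(size=2 * K)
eta = 0.1  # assumed gradient step size

phi_ctx = fourier_features(x_ctx, K)      # keys
residual = phi_ctx @ w - y_ctx            # f_w(x_i) - y_i

# Query coordinates at which the field update is read out.
x_q = rng.uniform(0, 1, size=8)
phi_q = fourier_features(x_q, K)          # queries

# (1) Field change after one negative-gradient step on
#     L(w) = 0.5 * sum_i (w . phi(x_i) - y_i)^2.
grad_w = phi_ctx.T @ residual
delta_f_gd = phi_q @ (-eta * grad_w)

# (2) Linear attention: queries phi(x_q), keys phi(x_i), values -eta * residual_i.
values = -eta * residual
delta_f_attn = (phi_q @ phi_ctx.T) @ values

assert np.allclose(delta_f_gd, delta_f_attn)  # the two readouts coincide
print("max abs difference:", np.max(np.abs(delta_f_gd - delta_f_attn)))
```

In this sketch the agreement is exact up to floating point because the two readouts differ only in the order of the matrix products: the gradient step contracts keys against values first, while linear attention contracts queries against keys first.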