Hybrid Attention State Space Models for Symbolic Calculation of Squared Amplitudes
Karaka Naidu · Eric Reinhardt · Victor Baules · Nobuchika Okada · Sergei Gleyzer
Abstract
Calculating squared amplitudes is a key step in computing the cross sections needed to compare experimental data with theoretical predictions. However, mapping amplitude expressions to their squared-amplitude expressions is computationally expensive. Prior works have formulated this task as a sequence-to-sequence problem, demonstrating the effectiveness of Transformer-based encoder–decoder architectures. Despite these successes, such approaches have been limited to relatively short sequences and fail to scale effectively to longer inputs, primarily due to the limitations of attention mechanisms in handling extended context windows. State Space Models (SSMs), such as Mamba, offer a recurrent alternative that can achieve performance comparable to that of Transformers on some tasks. In this work, we investigate hybrid Attention–SSM architectures and show that they outperform vanilla Transformers in a low-data, long-sequence task. The hybrid Attention–SSM models achieve up to a $\sim$4\% improvement in token accuracy and a $\sim$40\% improvement in full-sequence accuracy on the task of calculating squared amplitudes for electroweak physics processes.
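To make the hybrid idea concrete, the sketch below shows one way an encoder block could interleave multi-head self-attention with a recurrent state-space mixing layer. This is a minimal illustrative example, not the authors' implementation: the layer ordering, dimensions, and the simplified diagonal SSM parameterization are assumptions for exposition only.

```python
# Minimal sketch of a hybrid Attention-SSM block (illustrative assumptions,
# not the architecture used in the paper).
import torch
import torch.nn as nn


class SimpleSSM(nn.Module):
    """Toy diagonal SSM: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t (per channel)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.log_a = nn.Parameter(torch.zeros(d_model))  # per-channel decay
        self.b = nn.Parameter(torch.ones(d_model))
        self.c = nn.Parameter(torch.ones(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        a = torch.sigmoid(self.log_a)                     # keep recurrence stable in (0, 1)
        h = torch.zeros_like(x[:, 0])
        outputs = []
        for t in range(x.size(1)):                        # sequential scan over tokens
            h = a * h + self.b * x[:, t]
            outputs.append(self.c * h)
        return torch.stack(outputs, dim=1)


class HybridBlock(nn.Module):
    """One self-attention sublayer followed by one SSM sublayer, each with a residual."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ssm = SimpleSSM(d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.norm1(x)
        attn_out, _ = self.attn(q, q, q)
        x = x + attn_out                                  # residual around attention
        x = x + self.ssm(self.norm2(x))                   # residual around SSM
        return x


if __name__ == "__main__":
    tokens = torch.randn(2, 128, 256)                      # (batch, seq_len, d_model)
    print(HybridBlock()(tokens).shape)                     # torch.Size([2, 128, 256])
```

The intuition behind such a hybrid is that the SSM sublayer carries long-range information at linear cost in sequence length, while the attention sublayer retains precise token-to-token matching; production Mamba-style layers replace the toy scan above with input-dependent, hardware-efficient selective scans.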