Learning Permuted Congruential Sequences with Transformers
Abstract
We use pseudo-random number generators (PRNGs) as a controlled benchmark to probe Transformers' ability to uncover hidden recurrence. We focus on Permuted Congruential Generators (PCGs), which combine a linear recurrence with bit-wise shift, XOR, rotation, and truncation operations, and show that Transformers can successfully perform in-context prediction on unseen sequences from diverse PCG variants, in tasks that go beyond published classical attacks. Surprisingly, we find that even when the output is truncated to a single bit, the model can still predict it reliably. Analyzing the embedding layers, we uncover a novel clustering phenomenon: the model spontaneously groups the integer inputs into clusters that are invariant under bit-wise rotation, revealing how it processes the input sequences.
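To make the structure of the studied generators concrete, here is a minimal sketch of one PCG step in the style of the standard PCG32 (XSH-RR) variant: a 64-bit linear congruential state update followed by an xorshift, truncation, and rotation output permutation. The constants follow the reference PCG32 parameters; the exact variants and bit widths used in the paper's experiments are not specified here, so this is an illustrative example rather than the paper's precise setup.

```python
MULTIPLIER = 6364136223846793005  # reference PCG32 LCG multiplier
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def pcg32_xsh_rr(state: int, increment: int) -> tuple[int, int]:
    """One step of a PCG-style generator (XSH-RR output permutation).

    Returns (new_state, output): the hidden 64-bit state advances by a
    linear congruential recurrence, while the 32-bit output is a
    permuted, truncated view of the old state.
    """
    # Linear congruential recurrence on the hidden 64-bit state.
    new_state = (state * MULTIPLIER + increment) & MASK64
    # Output permutation: xorshift the high bits, then truncate to 32 bits.
    xorshifted = (((state >> 18) ^ state) >> 27) & MASK32
    # Rotate right by the top 5 bits of the old state.
    rot = state >> 59
    output = ((xorshifted >> rot) | (xorshifted << (32 - rot))) & MASK32
    return new_state, output


# Example: generate a short sequence from an arbitrary seed and increment.
state, increment = 0x853C49E6748FEA9B, 0xDA3E39CB94B95BDB
for _ in range(5):
    state, out = pcg32_xsh_rr(state, increment)
    print(out)
```

Truncation is what makes such generators hard for classical lattice-based attacks on linear congruential recurrences: the model only ever sees a permuted, low-bit slice of the hidden state, which is the setting the paper's in-context prediction task probes.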