Embedding Jets with Maximum Manifold Capacity Representations
Abstract
Self-supervised learning (SSL) has emerged as the dominant paradigm for training particle physics foundation models. Existing methods largely borrow from language modeling (e.g. masked- or next-token prediction) or computer vision (e.g. multi-view similarity objectives). In this work, we explore an alternative technique based on manifold capacity theory: a neuroscience-inspired framework for quantifying the linear separability of manifolds (point clouds) from their intrinsic geometric features. We apply the recently developed Maximum Manifold Capacity Representations (MMCR) technique to learn representations of simulated particle jets from hadron colliders, finding that MMCR matches or slightly surpasses a similarity-based objective (SimCLR), as measured by the linear class separability of the learned embeddings. These results position MMCR as a promising approach for representation learning in particle physics, and motivate further study in more complex settings.
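For concreteness, the sketch below illustrates the MMCR objective in its standard formulation (Yerxa et al., 2023): embeddings of augmented views are L2-normalized, averaged into a per-sample centroid, and the nuclear norm of the centroid matrix is maximized. This is a minimal PyTorch sketch, not the paper's implementation; the function name, tensor shapes, and batch sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mmcr_loss(z: torch.Tensor) -> torch.Tensor:
    """Minimal MMCR objective (per Yerxa et al., 2023).

    z: (batch, n_views, dim) embeddings of augmented views of each jet.
    Returns the negative nuclear norm of the per-sample centroid matrix,
    so minimizing the loss maximizes manifold capacity.
    """
    z = F.normalize(z, dim=-1)     # constrain each view embedding to the unit sphere
    centroids = z.mean(dim=1)      # (batch, dim): centroid of each jet's view manifold
    # Nuclear norm = sum of singular values: maximizing it spreads centroids
    # apart (linear separability) while the unit-norm constraint keeps each
    # view manifold compact.
    return -torch.linalg.matrix_norm(centroids, ord="nuc")

# Illustrative usage: 256 jets, 2 augmented views, 128-dim embeddings.
loss = mmcr_loss(torch.randn(256, 2, 128))
```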