The Geometry of Thought
Kalyan Cherukuri
Abstract
We propose the \emph{Geometric Hypothesis of Thought}: natural language and reasoning modalities are better modeled by a \emph{product} of specialized geometric spaces than by a single high-dimensional Euclidean embedding space. Motivated by theoretical and empirical work on hyperbolic and spherical embeddings and by recent analyses of foundation model activation geometry, we introduce the Mixed-Curvature Language Model (MC-LM), a transformer variant whose token representations live on a product manifold $\mathcal{M} = \mathbb{H}^n \times \mathbb{E}^m \times \mathbb{S}^k$. The MC-LM uses manifold-specific attention streams and a learned geometric gating mechanism to route information to the most appropriate curvature channel. We describe the architecture, provide principled manifold-aware adaptations of transformer components, and present targeted experiments that isolate hierarchical, analogical, and cyclical reasoning. We further describe large-scale evaluation protocols designed to reveal geometric specialization on downstream tasks. Across toy and large-scale benchmarks we find that MC-LM reduces perplexity, improves task-specific accuracy, and yields an interpretable ``Geometric Activation Score'' that illuminates which geometric subspace the model relies on for a given input. Our design synthesizes prior work on non-Euclidean embeddings, intrinsic dimensionality, and geometry-aware networks.
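One concrete reading of the product-manifold construction is the standard product metric, under which squared distances decompose additively across the three curvature factors; the abstract does not pin this down, so take the following as an assumption rather than the paper's definition:
\[
d_{\mathcal{M}}^2\big((h,e,s),(h',e',s')\big) \;=\; d_{\mathbb{H}^n}^2(h,h') + d_{\mathbb{E}^m}^2(e,e') + d_{\mathbb{S}^k}^2(s,s').
\]
Similarly, the sketch below illustrates one plausible form of the learned geometric gating: a per-token softmax router over the three manifold-specific attention streams, producing a convex combination of their outputs. All names here (`GeometricGate`, `streams`) are hypothetical; the paper's actual layer is not specified in the abstract, and the streams are assumed to have been mapped back to a shared tangent space so they can be mixed linearly.

```python
import torch
import torch.nn as nn


class GeometricGate(nn.Module):
    """Hypothetical sketch of a learned geometric gate: a per-token soft
    router over hyperbolic, Euclidean, and spherical attention streams.
    Dimensions and names are illustrative, not taken from the paper."""

    def __init__(self, d_model: int, n_manifolds: int = 3):
        super().__init__()
        self.router = nn.Linear(d_model, n_manifolds)

    def forward(self, x, streams):
        # x: (batch, seq, d_model); streams: one tensor per manifold channel,
        # each (batch, seq, d_model), assumed already projected to a common
        # tangent space so a linear mixture is well defined.
        weights = torch.softmax(self.router(x), dim=-1)       # (batch, seq, 3)
        stacked = torch.stack(streams, dim=-1)                # (batch, seq, d_model, 3)
        return (stacked * weights.unsqueeze(-2)).sum(dim=-1)  # convex combination


if __name__ == "__main__":
    gate = GeometricGate(d_model=64)
    x = torch.randn(2, 10, 64)
    streams = [torch.randn(2, 10, 64) for _ in range(3)]
    out = gate(x, streams)
    print(out.shape)  # torch.Size([2, 10, 64])
```

Under this reading, averaging the gate weights over a sequence would give one natural instantiation of the abstract's ``Geometric Activation Score'': the mean routing mass assigned to each curvature channel for a given input.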