Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE
Abstract
The InfoNCE loss in contrastive learning depends critically on a temperature parameter, yet the training dynamics it induces under fixed versus annealed temperature schedules remain poorly understood. We provide a theoretical analysis by modeling the evolution of embeddings under Langevin dynamics on a compact Riemannian manifold. Under mild smoothness and energy-barrier assumptions, we show that classical simulated-annealing guarantees extend to this setting: sufficiently slow logarithmic inverse-temperature schedules ensure convergence in probability to a globally optimal representation, while faster schedules can trap the dynamics in suboptimal local minima. Our results establish a formal link between contrastive learning and simulated annealing, providing a principled basis for understanding and tuning temperature schedules.
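The setting described above can be illustrated with a minimal sketch: an InfoNCE loss with an explicit temperature, and projected Langevin updates on the unit sphere under a logarithmic inverse-temperature schedule. The constant `c`, step size `eta`, and the schedule `beta_t = log(t + 2) / c` are illustrative assumptions, not the paper's exact quantities.

```python
import numpy as np

def info_nce_loss(z_anchor, z_pos, z_negs, tau):
    """InfoNCE loss for one anchor: one positive and a bank of negatives,
    with temperature tau scaling the (dot-product) similarities."""
    sims = np.concatenate(([z_anchor @ z_pos], z_negs @ z_anchor)) / tau
    sims -= sims.max()  # numerical stability before exponentiation
    return -sims[0] + np.log(np.exp(sims).sum())

def langevin_anneal(z, grad_fn, steps=1000, c=1.0, eta=1e-2, rng=None):
    """Projected Langevin dynamics on the unit sphere with an illustrative
    logarithmic inverse-temperature schedule beta_t = log(t + 2) / c.
    Slower growth (larger c) corresponds to the 'slow schedule' regime."""
    rng = rng or np.random.default_rng(0)
    for t in range(steps):
        beta = np.log(t + 2) / c  # slow logarithmic schedule
        noise = rng.standard_normal(z.shape)
        # Gradient step plus temperature-scaled Gaussian noise
        z = z - eta * grad_fn(z) + np.sqrt(2.0 * eta / beta) * noise
        # Retract back onto the sphere (the compact manifold here)
        z /= np.linalg.norm(z, axis=-1, keepdims=True)
    return z
```

A faster schedule (e.g. `beta_t` growing linearly in `t`) cools the noise too quickly, which is the regime the abstract warns may trap the dynamics in suboptimal minima.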