A Domain-Feature Ensemble with an AdaLN-Conditioned ViT for Weak Lensing Inference
Abstract
Weak lensing convergence maps carry key information for estimating the cosmological parameters Ωm and S8, but their low signal-to-noise ratio and rich non-Gaussian structure pose significant modeling challenges. To address these issues, we propose a Vision Transformer (ViT) conditioned on handcrafted cosmological domain features through Adaptive Layer Normalization (AdaLN), together with an ensemble selected by greedy search over heterogeneous feature sets and network variants, to achieve accurate and robust inference of Ωm and S8 from convergence maps. The handcrafted domain features, including the power spectrum, peak counts, and Minkowski functionals, among others, capture physically interpretable Gaussian and non-Gaussian characteristics of the convergence field. The ViT uses these features as physics priors via AdaLN to adaptively modulate its hidden representations. To improve robustness and uncertainty estimation, we train a diverse set of model variants with different feature subsets and architectural choices, and construct the final ensemble through greedy selection based on validation performance. Our approach demonstrates the effectiveness and robustness of combining domain knowledge, deep architectures, and ensemble modeling for weak lensing cosmology.
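To make the conditioning mechanism concrete, the following is a minimal PyTorch sketch of an AdaLN-modulated transformer block in which per-block scale, shift, and gate terms are predicted from the handcrafted domain features. The module and parameter names (AdaLNBlock, feat_dim, the two-layer conditioning head) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumption): an AdaLN-conditioned ViT block in PyTorch.
# Names such as AdaLNBlock and feat_dim are illustrative, not the authors' code.
import torch
import torch.nn as nn


class AdaLNBlock(nn.Module):
    """Transformer block whose LayerNorms are modulated by domain features."""

    def __init__(self, dim: int, feat_dim: int, n_heads: int = 8, mlp_ratio: float = 4.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, elementwise_affine=False)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        # Conditioning head: maps handcrafted summary statistics (power spectrum,
        # peak counts, Minkowski functionals, ...) to per-block scale/shift/gate terms.
        self.cond = nn.Sequential(nn.SiLU(), nn.Linear(feat_dim, 6 * dim))

    def forward(self, x: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) patch tokens; feats: (B, feat_dim) domain features.
        shift1, scale1, gate1, shift2, scale2, gate2 = self.cond(feats).chunk(6, dim=-1)
        h = self.norm1(x) * (1 + scale1.unsqueeze(1)) + shift1.unsqueeze(1)
        x = x + gate1.unsqueeze(1) * self.attn(h, h, h, need_weights=False)[0]
        h = self.norm2(x) * (1 + scale2.unsqueeze(1)) + shift2.unsqueeze(1)
        x = x + gate2.unsqueeze(1) * self.mlp(h)
        return x
```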
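The greedy ensemble construction can likewise be sketched as selection with replacement over the trained variants, keeping at each step the model whose addition most reduces validation error. This is a generic Caruana-style selection loop under assumed names; the actual validation metric and member pool used in the paper may differ.

```python
# Minimal sketch (assumption): greedy ensemble selection over trained model variants.
# val_preds holds each variant's validation predictions for (Omega_m, S8);
# the MSE metric and helper names are hypothetical.
import numpy as np


def greedy_ensemble(val_preds: list[np.ndarray], val_true: np.ndarray, max_members: int = 10):
    """Greedily pick (with replacement) models whose averaged predictions
    minimize validation error; returns the selected model indices."""
    selected: list[int] = []
    running_sum = np.zeros_like(val_true, dtype=float)
    for _ in range(max_members):
        best_idx, best_err = None, np.inf
        for i, preds in enumerate(val_preds):
            candidate = (running_sum + preds) / (len(selected) + 1)
            err = np.mean((candidate - val_true) ** 2)
            if err < best_err:
                best_idx, best_err = i, err
        selected.append(best_idx)
        running_sum += val_preds[best_idx]
    return selected
```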