An LLM-driven framework for cosmological model-building and exploration
Abstract
Our understanding of cosmic evolution relies on dark energy and dark matter, mysterious components that are detectable only through their gravitational effects even though they make up 95% of the Universe's energy content today. Recent surveys reveal systematic discrepancies in dark energy's temporal evolution, potentially indicating new physics. Given the success of Large Language Models (LLMs) at research-level coding and mathematical reasoning, we investigate whether LLMs can autonomously propose, implement, and test cosmological models. We challenge an agentic LLM (Claude Code) in three settings: (1) implementing alternative cosmological models from curated descriptions by modifying a physics simulation codebase, (2) implementing those models directly from research papers, and (3) generating novel dark energy hypotheses to explain recent observations. When given curated descriptions with numerical implementation tips, Claude Code successfully implements both test models, "Thawing Quintessence" and "Early Dark Energy", with numerical accuracy that differs between the two, reflecting their different complexity. Working from papers rather than curated descriptions significantly degrades numerical accuracy, although the qualitative behavior remains correct. Interestingly, Claude Code's self-proposed dark energy model achieves observational fits comparable to those of the standard model, though it requires additional parameters.