Poster in Workshop: Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling

Agentic Lean Autoformalization (ALA): An LLM collaborative approach to autoformalization in LEAN

Patricio Gallardo · Maziar Raissi · Ke Zhang · Sudhir Murthy

Abstract

The arrival of AI systems that can achieve a gold medal at the International Mathematical Olympiad (IMO) and the development of proof assistants such as Lean seem to foretell a transformative revolution in mathematical research. However, a bottleneck is that most undergraduate- and graduate-level theorems have not yet been translated into code for proof assistants, a translation process known as autoformalization. State-of-the-art LLMs fine-tuned for Lean 4 report at most 22.5% accuracy (Pass@128) on graduate-level theorems. To address this gap, we propose and evaluate ALA, an agentic framework in which a generalist LLM that orchestrates tools works together with a second LLM fine-tuned for Lean 4. ALA achieves 52% accuracy with fewer than 13 tool calls on theorems from areas such as complex and real analysis, topology, and algebra. Our code and the related dataset are published on GitHub.
