Timezone: »

NaturalProofs: Mathematical Theorem Proving in Natural Language
Sean Welleck · Jiacheng Liu · Ronan Le Bras · Hanna Hajishirzi · Yejin Choi · Kyunghyun Cho

Fri Dec 10 12:20 AM -- 12:30 AM (PST) @

Understanding and creating mathematics using natural mathematical language - the mixture of symbolic and natural language used by humans - is a challenging and important problem for driving progress in machine learning. As a step in this direction, we develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs, written in natural mathematical language. NaturalProofs unifies broad coverage, deep coverage, and low-resource mathematical sources, allowing for evaluating both in-distribution and zero-shot generalization. Using NaturalProofs, we benchmark strong neural methods on mathematical reference retrieval and generation tasks which test a system's ability to determine key results that appear in a proof. Large-scale sequence models show promise compared to classical information retrieval methods, yet their performance and out-of-domain generalization leave substantial room for improvement. NaturalProofs opens many avenues for research on challenging mathematical tasks.

Author Information

Sean Welleck (University of Washington)
Jiacheng Liu (Department of Computer Science, University of Washington)
Ronan Le Bras (Allen Institute for AI)
Hanna Hajishirzi (University of Washington)
Yejin Choi (University of Washington)
Kyunghyun Cho (Genentech | New York University)

Kyunghyun Cho is an associate professor of computer science and data science at New York University and a research scientist at Facebook AI Research. He was a postdoctoral fellow at the Université de Montréal until summer 2015 under the supervision of Prof. Yoshua Bengio, and received PhD and MSc degrees from Aalto University early 2014 under the supervision of Prof. Juha Karhunen, Dr. Tapani Raiko and Dr. Alexander Ilin. He tries his best to find a balance among machine learning, natural language processing, and life, but almost always fails to do so.

More from the Same Authors