Data-driven Design as a High-Impact, Ecologically Valid Benchmark for Document Understanding
Sireesh Gururaja · Junwon Seo · Hung-Yi Lin · Jeremiah Milbauer · Anthony Rollett · Emma Strubell
Abstract
Data-driven design (DDD) is viewed in materials science as a promising avenue to accelerate materials discovery by narrowing the search space for candidate materials with desirable properties, and it relies on correctly extracted information from prior literature. Existing methods for DDD-related information extraction, however, rely either on laborious, hand-engineered pipelines or on the annotation of significant amounts of hard-to-collect data. We therefore propose DDD as a benchmark for zero- and few-shot document understanding focused on text, tables, and charts. Accurate generalization to new, unseen material domains could accelerate scientific discovery by enabling the use of DDD in previously unexplored domains.