Frontline-AI: Building Real-Time Infrastructure for Infectious Disease Response
Abstract
Emerging infectious diseases spread fastest in the narrow window before laboratory confirmation, when frontline clinicians must decide under severe uncertainty whether to isolate patients, prioritize scarce diagnostic assays, or escalate to sequencing. Yet no open dataset captures this pre-diagnostic decision space, leaving AI systems untested in the very conditions where they could have the greatest impact. We propose Frontline-AI, the first benchmark dataset of prospectively collected, multimodal patient encounters from outbreak-prone settings. Each record links early clinical features (vitals, symptoms, rapid tests, notes), epidemiological exposures (travel, contact history), and facility constraints (staffing, beds, oxygen) to definitive outcomes (PCR and sequencing results). Frontline-AI defines three core tasks: (i) pre-test quarantine triage, (ii) diagnostics and sequencing prioritization, and (iii) hotspot and anomaly detection. Each task is paired with baselines, evaluation metrics, and reproducible pipelines to support benchmarking. By linking real-time frontline signals to gold-standard outcomes, the dataset supports training and evaluating models that reason under diagnostic uncertainty, allocate scarce resources, and detect emerging outbreaks earlier. Beyond pandemic preparedness, the same benchmarks can accelerate AI for emergency medicine, disaster response, and critical care. Frontline-AI aims to catalyze a new generation of AI systems designed for high-stakes, real-world decision-making in global health.