ImmunoOncoAtlas: Towards Next-Generation Immuno-Oncology
Abstract
ImmunoOncoAtlas is a next-generation open dataset for AI-driven immuno-oncology, centered on tumor-specific T cells (TSTs) within the tumor immune microenvironment. It supports two tasks: (i) predicting context-specific mechanisms of TST activation, exhaustion, and suppression, and (ii) recommending novel interventions (target combinations, dosing regimens, and next-generation antibodies) with improved efficacy–toxicity trade-offs. Unlike resources built from a single source type, it fuses full-text, figure-aware research articles with annotated therapeutic patents. The two sources are aligned through paragraph- and figure-level provenance and enriched with metadata (cancer type, cell subtype, intervention class, outcomes, toxicity) to yield mechanism→outcome labels at scale. To keep construction feasible and rigorous, we use an agentic, human-in-the-loop pipeline with three stages: agentic sourcing with expert triage; multimodal AI pre-annotation with human aggregation; and a closed-loop “digital peer review” for inconsistency detection and active learning. We will release the schema, guidelines, a seed corpus, and baselines for prediction and recommendation. By providing a shared, mechanistic substrate, ImmunoOncoAtlas aims to become an “ImageNet” for tumor immunology.
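To make the record structure described above concrete, the following is a minimal sketch of what one mechanism→outcome entry with paragraph-/figure-level provenance might look like, together with a simple check in the spirit of the "digital peer review" step. The class names, field names, locator format, and the `find_conflicts` helper are illustrative assumptions and not the released schema; only the metadata categories (cancer type, cell subtype, intervention class, outcomes, toxicity) and the three mechanism classes come from the text.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass(frozen=True)
class Provenance:
    """Pointer to the passage or figure supporting a label (hypothetical layout)."""
    source_type: Literal["article", "patent"]
    source_id: str   # assumed identifier, e.g. a DOI or a patent number
    locator: str     # assumed format, e.g. "para:12" or "fig:3B"


@dataclass
class MechanismRecord:
    """One mechanism -> outcome label with the metadata named in the abstract."""
    cancer_type: str
    cell_subtype: str
    intervention_class: str
    mechanism: Literal["activation", "exhaustion", "suppression"]
    outcome: str
    toxicity: str
    provenance: list[Provenance] = field(default_factory=list)


def find_conflicts(records):
    """Return contexts that carry more than one mechanism label.

    A context is the (cancer_type, cell_subtype, intervention_class) triple.
    A hit is only a candidate for expert review, not necessarily an error.
    """
    labels_by_context = {}
    for record in records:
        key = (record.cancer_type, record.cell_subtype, record.intervention_class)
        labels_by_context.setdefault(key, set()).add(record.mechanism)
    return {
        key: sorted(labels)
        for key, labels in labels_by_context.items()
        if len(labels) > 1
    }
```

Under these assumptions, passing a list of records to `find_conflicts` yields a dictionary keyed by context, which could be routed to human annotators or used to prioritize examples for active learning.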