NeurIPS Expo Demonstration ALICE: Agentic Logic for Incident and Codebug Elimination

Expo Demonstration

ALICE: Agentic Logic for Incident and Codebug Elimination

Ramesh Kumar Kottapalli

Upper Level Room 29A-D

[ Abstract ]

Tue 2 Dec noon PST — 3 p.m. PST

Abstract:

Modern incident root-cause analysis (RCA) is constrained by partial observability, symptom-centric signals, and the overwhelming noise present in logs, traces, and metrics. Diagnosing production failures often depends on instrumentation quality and human expertise, while latent software defects, configuration errors, and zero-day failure modes remain difficult to pinpoint. To address these challenges, we demonstrate a multi-agent system for incident diagnostics that augments observability data with application source code and static analysis signals.x000D x000D Our system introduces two cooperating agents: the Code Context Agent (COCOA), which builds a knowledge graph of program dependencies, control/data flows, and caller–callee relationships; and the Incident Diagnostics Agent (IDA), which performs agentic reasoning over an entity topology graph enriched with observability streams. Together, these agents extend topology-aware planning (TAP) to simultaneously operate on program dependency graphs and infrastructure entity graphs, thereby linking runtime symptoms with underlying code-level causes.x000D x000D This demo showcases how multi-agent collaboration enables deeper, context-sensitive RCA. We walk through real-world inspired scenarios—including incidents where critical log lines are hidden in noisy observability streams or where latent defects emerge only after system updates—illustrating how the system surfaces root causes that would otherwise remain invisible. By bridging program analysis with runtime observability, our approach moves beyond symptom-driven diagnostics toward a more reliable, automated framework for incident management.

Chat is not available.