Examining the Vulnerability of Multi-Agent Medical Systems to Human Interventions for Clinical Reasoning
Abstract
Human interventions at fault points can alter the diagnostic accuracy of multi-agent medical systems. We defined fault points as the moments in doctor-patient conversations at which the Doctor Agent's reasoning became most vulnerable to external influence. Using the MedQA dataset, this study analyzed simulated doctor-patient conversations to measure how interventions at fault points shifted reasoning and accuracy. Correct interventions improved baseline diagnostic accuracy by as much as 40%, while incorrect or bias-related interventions degraded performance by up to 6% and increased diagnostic drift and uncertainty. Beyond accuracy, the analysis revealed behavioral parallels between cognitive biases in the simulated medical AI and those documented in real-world clinical practice, such as premature closure and susceptibility to misleading cues. Overall, these findings demonstrate that strategically priming large language models (LLMs) at their fault points can substantially enhance LLM-driven diagnostic systems, improve reliability, and reveal where interventions may introduce drift or reinforce bias.
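To make the experimental setup concrete, the sketch below outlines one way a fault-point intervention study of this kind could be structured. It is a minimal, hypothetical illustration, not the paper's actual implementation: the names `detect_fault_point`, `run_trial`, and `accuracy`, the heuristic for locating a fault point, and the `doctor_agent` callable are all assumptions introduced here for clarity.

```python
# Illustrative sketch of a fault-point intervention experiment.
# All function names and the fault-point heuristic are hypothetical stand-ins,
# not the implementation described in the paper.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    question: str  # MedQA-style clinical vignette
    answer: str    # gold-standard diagnosis


def detect_fault_point(turns: List[str]) -> int:
    """Hypothetical heuristic: flag the turn where the Doctor Agent first
    commits to a diagnosis, treated here as the point most open to influence."""
    for i, turn in enumerate(turns):
        lowered = turn.lower()
        if "most likely" in lowered or "diagnosis is" in lowered:
            return i
    return len(turns) - 1


def run_trial(case: Case,
              doctor_agent: Callable[[List[str]], List[str]],
              intervention: Optional[str]) -> bool:
    """Simulate one conversation, optionally inject an intervention message
    at the detected fault point, and check the final diagnosis against gold."""
    turns = doctor_agent([case.question])
    if intervention is not None:
        fp = detect_fault_point(turns)
        turns = turns[: fp + 1] + [intervention]
        turns = doctor_agent(turns)  # let the agent continue after the nudge
    return case.answer.lower() in turns[-1].lower()


def accuracy(cases: List[Case],
             doctor_agent: Callable[[List[str]], List[str]],
             intervention: Optional[str] = None) -> float:
    """Fraction of cases answered correctly, with or without the intervention."""
    hits = sum(run_trial(c, doctor_agent, intervention) for c in cases)
    return hits / len(cases)
```

Under this framing, comparing `accuracy(cases, agent)` against `accuracy(cases, agent, intervention="Consider re-examining the lab values before concluding.")` would quantify the accuracy shift attributable to a single intervention style, which is the kind of contrast the reported 40% improvement and 6% degradation figures describe.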