Surgical Information Assistant: an agentic information retrieval system for surgical information and a benchmark dataset
Kiran Bhattacharyya
Abstract
We present the Surgical Information Assistant, an agentic retrieval-augmented generation (RAG) system designed to improve access to surgical knowledge in resource-constrained settings. Built on the Open Manual of Surgery for Resource-Limited Settings, the assistant uses a retrieval method we call DeRetSyn (Decompose–Retrieve–Synthesize). We evaluate DeRetSyn using automated metrics and partial human validation across 14,500 synthesized question–answer pairs and find that it achieves 63\% top-1 accuracy using a 3B Llama model, outperforming both GPT-4o without RAG (42.5\%) and an 8B Llama model with conventional RAG ($\simeq$53\%) while being significantly smaller and more computationally efficient. We also find that DeRetSyn with the Llama 3B model outperforms GPT-4o in overall accuracy on the publicly available PubMedQA dataset under specific prompting patterns. The Surgical Information Assistant demonstrates how agentic orchestration can extend the capabilities of small language models and offers a deployable framework for point-of-care medical decision support, education, and QA in low-bandwidth environments. Upon publication, we plan to release our benchmark dataset, codebase, prompt library, and RAG evaluation results for all categories across the entire dataset, along with chain-of-thought reasoning from GPT-4o, Llama-3.1-8B, and Llama-3.2-3B.
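The abstract names the three DeRetSyn stages but does not specify how each is implemented. The following is only a minimal illustrative sketch of a decompose, retrieve, synthesize loop, not the released system: the toy corpus, the function names (`decompose`, `retrieve`, `synthesize`, `deretsyn`), the split-on-"and" decomposition, and the keyword-overlap retriever are all assumptions standing in for the language model and the index built on the Open Manual of Surgery.

```python
from typing import List, Set

# Hypothetical toy corpus; the real assistant retrieves from the
# Open Manual of Surgery for Resource-Limited Settings.
CORPUS = [
    "Tourniquet use controls limb bleeding during amputation.",
    "Ketamine provides anesthesia when ventilators are unavailable.",
    "Wound irrigation with clean water reduces infection risk.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase word set with trailing punctuation stripped."""
    return {word.strip(".,?").lower() for word in text.split()}


def decompose(question: str) -> List[str]:
    """Assumed stand-in for an LLM call: split a compound question on ' and '."""
    parts = [part.strip(" ?") for part in question.split(" and ")]
    return [part for part in parts if part]


def retrieve(query: str, corpus: List[str], k: int = 1) -> List[str]:
    """Assumed stand-in for a vector index: rank passages by shared words."""
    query_tokens = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_tokens & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def synthesize(question: str, evidence: List[str]) -> str:
    """Assumed stand-in for LLM synthesis: join the unique passages in order."""
    unique = list(dict.fromkeys(evidence))
    return " ".join(unique)


def deretsyn(question: str, corpus: List[str] = CORPUS) -> str:
    """Decompose the question, retrieve per sub-question, then synthesize."""
    evidence: List[str] = []
    for sub_question in decompose(question):
        evidence.extend(retrieve(sub_question, corpus))
    return synthesize(question, evidence)


if __name__ == "__main__":
    print(deretsyn(
        "How is limb bleeding controlled and what anesthesia works without ventilators?"
    ))
```

The point of the sketch is the control flow: each sub-question gets its own retrieval pass, so a compound question can pull evidence from passages that a single query over the whole question might rank poorly. In a real deployment each placeholder would presumably be replaced by a model call or an index lookup.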