LLM-Integrated Representative Path Selection for Context-Aware Drug Repurposing on Biomedical Knowledge Graphs
Abstract
Drug repurposing, which seeks novel drug–disease associations by integrating biomedical knowledge, faces challenges in modeling complex multi-hop relationships in knowledge graphs. We propose DrugCORpath, an approach that integrates biomedical knowledge graphs with pretrained biomedical large language models (LLMs). Unlike methods that learn isolated node representations, DrugCORpath captures biological context by embedding path sentences derived from multi-hop drug–disease connections. Each path is converted into a biological path sentence reflecting a plausible mechanism of action (MoA), enabling the model to capture rich semantic relationships among entities. We apply selective filtering based on KMeans clustering and distance metrics to retain meaningful paths while removing redundancy and noise. Experiments show that DrugCORpath outperforms graph-based, LLM-based, and path-based baselines, achieving up to 4.9% higher accuracy than the previous state of the art. Further analysis confirms that path filtering reduces noise and enhances biological diversity, and case studies validate the clinical relevance of the selected paths, underscoring the method’s potential for interpretable, biologically plausible drug repurposing.
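The path-selection idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the entity and relation labels, and the choice to keep the single path nearest each centroid are all assumptions made for illustration. In particular, `toy_embed` is a hypothetical stand-in for a pretrained biomedical LLM encoder, and the small k-means routine stands in for a library KMeans.

```python
import math
import random


def path_to_sentence(path):
    """Turn [entity, relation, entity, ...] into one plain-text path sentence."""
    parts = [path[0]]
    for i in range(1, len(path) - 1, 2):
        parts.append(f"{path[i]} {path[i + 1]}")
    return " ".join(parts) + "."


def toy_embed(sentence, dim=8):
    """Hypothetical stand-in for an LLM encoder: normalized hashed bag of words."""
    vec = [0.0] * dim
    for token in sentence.lower().rstrip(".").split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def select_representatives(paths, k):
    """Keep, per cluster, the path sentence closest to the cluster centroid."""
    sentences = [path_to_sentence(p) for p in paths]
    vectors = [toy_embed(s) for s in sentences]
    k = min(k, len(vectors))
    centroids, labels = kmeans(vectors, k)
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            best = min(members, key=lambda i: euclidean(vectors[i], centroids[c]))
            chosen.append(sentences[best])
    return chosen


# Toy multi-hop drug-disease paths with placeholder names (not real data).
paths = [
    ["DrugA", "targets", "GeneX", "associated_with", "DiseaseY"],
    ["DrugA", "binds", "GeneX", "associated_with", "DiseaseY"],
    ["DrugA", "modulates", "PathwayP", "involved_in", "DiseaseY"],
    ["DrugA", "resembles", "DrugB", "treats", "DiseaseY"],
]
representatives = select_representatives(paths, k=2)
```

Under these assumptions, near-duplicate paths tend to fall into the same cluster and collapse to one representative, which is the redundancy-reduction effect the abstract describes. A realistic pipeline would swap `toy_embed` for sentence embeddings from a biomedical language model.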