Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug design and repurposing. Recent work has shown that general-domain language models (LMs) can serve as "soft" KGs, and that they can be fine-tuned for the task of KG completion. In this work, we study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We evaluate several domain-specific LMs, fine-tuning them on datasets centered on drugs and diseases that we represent as KGs and enrich with textual entity descriptions. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance. Finally, we demonstrate the advantage of LM models in the inductive setting with novel scientific entities. Our datasets and code are made publicly available.
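The router idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the router is a simple binary classifier (here, logistic regression trained by gradient descent) that predicts, per input example, whether the LM-based model or the KG embedding model should handle it. The function names `train_router` and `route`, the single input feature, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def train_router(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression router with per-example gradient descent.

    labels[i] is 1 if the LM-based model did better on example i and 0 if
    the KG embedding model did better (an assumed labelling scheme).
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(route to LM)
            grad = p - y                    # gradient of the log loss w.r.t. z
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def route(x, weights, bias):
    """Return which model type should score this example."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return "lm" if z > 0 else "kge"


# Made-up toy data. The single feature is a hypothetical "normalized entity
# degree": sparsely connected entities are routed to the LM-based model.
features = [[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_router(features, labels)
```

At prediction time, `route(x, weights, bias)` would pick which model's ranking to use for each example. A real router would need informative features and held-out data for training, neither of which is specified here.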
Author Information
Rahul Nadkarni (University of Washington)
David Wadden (Department of Computer Science, University of Washington)
Iz Beltagy (Allen Institute for AI)
Noah Smith (University of Washington)
Hanna Hajishirzi (University of Washington)
Tom Hope (Allen Institute for Artificial Intelligence)
More from the Same Authors
- 2021: NaturalProofs: Mathematical Theorem Proving in Natural Language
  Sean Welleck · Jiacheng Liu · Ronan Le Bras · Hanna Hajishirzi · Yejin Choi · Kyunghyun Cho
- 2021: Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery
  Jason Portenoy · Jevin West · Eric Horvitz · Daniel Weld · Tom Hope
- 2021: A Search Engine for Discovery of Scientific Challenges and Directions
  Dan Lahav · Jon Saad-Falcon · Duen Horng Chau · Diyi Yang · Eric Horvitz · Daniel Weld · Tom Hope
- 2021: Understanding and Knowledge Extraction from Mathematical and Scientific Text
  Hanna Hajishirzi
- 2021 Poster: FLEX: Unifying Evaluation for Few-Shot NLP
  Jonathan Bragg · Arman Cohan · Kyle Lo · Iz Beltagy
- 2021 Poster: One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
  Akari Asai · Xinyan Yu · Jungo Kasai · Hanna Hajishirzi