Multimodal KB Extraction and Completion
In: 6th Workshop on Automated Knowledge Base Construction (AKBC)
Abstract
Existing pipelines for constructing KBs support only a restricted set of data types: they focus on the text of documents when extracting information, ignoring the other modalities of evidence we regularly encounter, such as images, semi-structured tables, video, and audio. Similarly, approaches that reason over incomplete and uncertain KBs are limited to basic entity-relation graphs, ignoring the diverse data types that are useful for relational reasoning, such as text, images, and numerical attributes. In this work, we present a novel AKBC pipeline that takes first steps toward combining textual and relational evidence with other sources such as numerical, image, and tabular data. We focus on two tasks: attribute extraction for a single entity from documents, and relational knowledge graph completion. For each task, we introduce new datasets that contain multimodal information, propose benchmark evaluations, and develop models that build upon advances in deep neural encoders for different data types.
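To make the modeling idea concrete, the sketch below shows one way modality-specific neural encoders (for numerical, textual, and image-valued attributes) could be combined with entity and relation embeddings in a single scoring function for knowledge graph completion. The module names, dimensions, and the choice of a DistMult-style bilinear score are illustrative assumptions for this sketch, not the exact architecture of the models described above.

# Illustrative sketch (not the paper's exact model): modality-specific encoders
# map each kind of object value into a shared embedding space, and a
# DistMult-style bilinear product scores (subject, relation, object) triples,
# where the object may be an entity, a number, a piece of text, or an image.
import torch
import torch.nn as nn

EMB_DIM = 128  # assumed embedding size

class NumericEncoder(nn.Module):
    """Projects a scalar attribute (e.g. a year or height) into the embedding space."""
    def __init__(self, dim=EMB_DIM):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(1, dim), nn.Tanh())

    def forward(self, x):                    # x: (batch, 1)
        return self.proj(x)

class TextEncoder(nn.Module):
    """Averages word embeddings as a simple stand-in for an RNN/CNN text encoder."""
    def __init__(self, vocab_size, dim=EMB_DIM):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim, mode="mean")

    def forward(self, token_ids, offsets):   # flat token ids + per-example offsets
        return self.embed(token_ids, offsets)

class ImageEncoder(nn.Module):
    """Maps a precomputed image feature vector (e.g. from a CNN) into the space."""
    def __init__(self, feat_dim=2048, dim=EMB_DIM):
        super().__init__()
        self.proj = nn.Linear(feat_dim, dim)

    def forward(self, feats):                # feats: (batch, feat_dim)
        return self.proj(feats)

class MultimodalDistMult(nn.Module):
    """Scores (subject, relation, object) with a bilinear product, where the
    object embedding may come from any of the modality encoders above."""
    def __init__(self, num_entities, num_relations, dim=EMB_DIM):
        super().__init__()
        self.entity = nn.Embedding(num_entities, dim)
        self.relation = nn.Embedding(num_relations, dim)

    def score(self, subj_ids, rel_ids, obj_emb):
        s = self.entity(subj_ids)            # (batch, dim)
        r = self.relation(rel_ids)           # (batch, dim)
        return (s * r * obj_emb).sum(dim=-1) # higher = more plausible triple

Under this sketch, training would maximize the score of observed triples against negatively sampled ones (e.g. with a binary cross-entropy loss), treating each encoder's output interchangeably with entity embeddings on the object side; the actual models, datasets, and losses are described in the paper itself.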