Affinity Workshop: Women in Machine Learning

Reducing False Negatives in Distantly Supervised Learning Using a Dependency Tree-LSTM to Construct a Knowledge Graph

Samira Korani


Knowledge Graphs (KGs) are a fundamental part of a wide variety of NLP applications, improving the accuracy and explainability of knowledge. Constructing knowledge graphs from unstructured text supports entity detection and the extraction of semantic relations. The smallest unit of a knowledge graph is a triple, consisting of the subject of the relation, the relation itself, and the object of the relation. Extracting semantic relations, however, is difficult: supervised relation extraction (RE) requires huge amounts of labelled data, which is labour-intensive and time-consuming to produce. Some studies have therefore proposed distant supervision (DS), which generates KG triples from the co-occurrence of entities in a sentence; in other words, any sentence containing an entity pair is assumed to express a relation between them. However, these methods struggle to obtain high-quality relations, suffering from false negatives (FN) and false positives (FP).

In our paper, we use a new encoder-decoder model and a multilayer perceptron to detect false negatives in two popular DS datasets (NYT10 and GIDS); the candidate FN samples are unlabelled, and a model using a dependency Tree Bi-LSTM is trained to assign them new labels, improving on previous results. To summarise, our core contributions are: constructing an encoder based on entity importance in the distantly supervised RE dataset; a model to detect false negatives; and an algorithm that predicts relations using a combination of a dependency tree and a Tree Bi-LSTM. The result is a significant contribution, with a 25% improvement over comparable models.

The false-negative detector filters FN samples from the negative set N whose logits are larger than a threshold θ. The model discovered 6,324 FN samples from NYT10, which refer to 4,153 entity pairs, and 324 FN samples from GIDS, which refer to 285 entity pairs. The average precision is 92.0. In further research, we aim to reduce false positives in distantly supervised learning.
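The filtering step described in the abstract, selecting samples from the negative set N whose detector logits exceed the threshold θ, can be sketched as follows. This is a minimal illustration only; the function and variable names are our own, not taken from the paper.

```python
# Minimal sketch of false-negative filtering: keep negative-set samples
# whose detector logit exceeds the threshold theta. The names used here
# (filter_false_negatives, theta) are illustrative assumptions.

def filter_false_negatives(negative_samples, logits, theta):
    """Return the samples in N whose logit is larger than theta."""
    return [sample for sample, logit in zip(negative_samples, logits)
            if logit > theta]

# Toy example: four sentence-level negative samples with detector logits.
N = ["s1", "s2", "s3", "s4"]
logits = [0.91, 0.10, 0.77, 0.42]
candidates = filter_false_negatives(N, logits, theta=0.5)
print(candidates)  # ['s1', 's3']
```

The selected candidates would then be passed, unlabelled, to the dependency Tree Bi-LSTM model for relabelling.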
