Poster
in
Workshop: Human in the Loop Learning (HiLL) Workshop at NeurIPS 2022

Improving Named Entity Recognition in Telephone Conversations via Effective Active Learning with Human in the Loop

Md Tahmid Rahman Laskar ⋅ Cheng Chen ⋅ Xue-Yong Fu ⋅ Shashi Bhushan

[ Poster]

Abstract

Telephone transcription data can be very noisy due to speech recognition errors,disfluencies, etc. Not only that annotating such data is very challenging for theannotators, but also such data may have lots of annotation errors even after theannotation job is completed, resulting in a very poor model performance. In thispaper, we present an active learning framework that leverages human in the looplearning to identify data samples from the annotated dataset for re-annotation thatare more likely to contain annotation errors. In this way, we largely reduce the needof data re-annotation for the whole dataset. We conduct extensive experimentswith our proposed approach for Named Entity Recognition and observe that byre-annotating only about 6% training instances out of the whole dataset, the F1score for a certain entity type can be significantly improved by about 25%.

Chat is not available.