INTRODUCTION Word imputation is the task of locating and filling in missing words in a sentence. The task was first proposed by Kaggle in their Billion Word Imputation competition. The model proposed in this demo takes on a more challenging version of the task: it imputes multiple missing words rather than a single word, and it can also add words to an already complete sentence to produce a more complex one. The latter is called sentence expansion.
OBJECTIVE
* Given an incomplete sentence, find the locations of the missing words in the sentence and impute them.
* Given a complete sentence, find the locations where the sentence can be improved and impute the words required to improve it.
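To make the first objective concrete, the following is a minimal sketch of how a single training example could be derived from a complete sentence. The function name `make_example` and the labelling scheme are assumptions for illustration only: here each kept token gets label 1 if one or more words were removed immediately before it, and one extra trailing slot marks a gap at the end of the sentence. The abstract does not specify how the authors encode gap locations.

```python
from typing import List, Sequence, Set, Tuple


def make_example(
    tokens: Sequence[str], removed: Set[int]
) -> Tuple[List[str], List[int], List[str]]:
    """Split a complete sentence into (incomplete sentence, gap labels, missing words).

    Hypothetical labelling scheme (an assumption, not taken from the source):
    labels[i] == 1 means at least one word is missing right before kept[i];
    the final label refers to a possible gap after the last kept token.
    """
    kept: List[str] = []
    labels: List[int] = []
    missing: List[str] = []
    gap_open = False  # True while we are inside a run of removed words

    for i, tok in enumerate(tokens):
        if i in removed:
            missing.append(tok)
            gap_open = True
        else:
            kept.append(tok)
            labels.append(1 if gap_open else 0)
            gap_open = False

    # Extra slot so that words removed at the end of the sentence are also marked.
    labels.append(1 if gap_open else 0)
    return kept, labels, missing


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    kept, labels, missing = make_example(sentence, removed={1, 4})
    print(kept)
    print(labels)
    print(missing)
```

Under this scheme the label sequence is the target of the location task and the list of removed words is the target of the generation task.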
METHOD The model is an encoder-decoder network with two learning objectives. The first is to find the locations of the missing words, which is solved as a binary sequence classification task. The second is to generate the sequence of missing words. The embedding of the end-of-sequence token in the decoder is computed dynamically as a function of the hidden state of the encoder. The model also employs an RNN-based language model as a scorer in the beam search algorithm to efficiently generate linguistically correct sequences of words.
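The idea of using a language model as a scorer inside beam search can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `decoder` and `lm` are hypothetical callables that map a word prefix to log-probabilities over next words (standing in for the decoder network and the RNN language model), and the linear combination with `lm_weight` is one common way to mix the two scores, chosen here for simplicity.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: prefix of words -> {next word: log-probability}.
Scorer = Callable[[Sequence[str]], Dict[str, float]]


def beam_search(
    decoder: Scorer,
    lm: Scorer,
    beam_size: int = 3,
    max_len: int = 5,
    lm_weight: float = 0.5,  # assumed mixing weight, not from the source
    eos: str = "</s>",
) -> List[Tuple[List[str], float]]:
    """Return finished hypotheses sorted by combined score, best first."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    unseen = math.log(1e-6)  # floor for words the language model does not list

    for _ in range(max_len):
        candidates: List[Tuple[List[str], float]] = []
        for prefix, score in beams:
            lm_scores = lm(prefix)
            for word, dec_lp in decoder(prefix).items():
                # Combined score: decoder log-prob plus weighted LM log-prob.
                combined = score + dec_lp + lm_weight * lm_scores.get(word, unseen)
                candidates.append((prefix + [word], combined))

        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == eos:
                finished.append((seq[:-1], score))  # drop the end marker
            else:
                beams.append((seq, score))
        if not beams:
            break

    finished.extend(beams)  # keep hypotheses that hit max_len without ending
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


# Toy stand-ins for the two models, used only to exercise the search.
def toy_decoder(prefix: Sequence[str]) -> Dict[str, float]:
    if not prefix:
        return {"big": math.log(0.6), "red": math.log(0.4)}
    return {"</s>": 0.0}


def toy_lm(prefix: Sequence[str]) -> Dict[str, float]:
    if not prefix:
        return {"big": math.log(0.2), "red": math.log(0.8)}
    return {"</s>": 0.0}


if __name__ == "__main__":
    for words, score in beam_search(toy_decoder, toy_lm, beam_size=2, lm_weight=1.0):
        print(words, round(score, 3))
```

In this toy setup the decoder alone prefers "big", while the language model score shifts the ranking towards "red" once `lm_weight` is large enough, which is the kind of re-ranking effect a language model scorer is meant to provide.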
DEMO The demo is a mobile-friendly interactive web application in which users type a sentence and the model lists the top N predictions for completing or expanding it. To add an element of engagement, there will be a fun challenge before the app is presented: each audience member comes up with a sentence (from poetry, quotes, ...) that has missing words. These sentences will then be swapped around to be completed by both the audience and the model. Finally, the answers will be presented to compare the creativity of the audience with that of the model.
Osman Ramadan (Microsoft)
R&D Software Engineer at SwiftKey, Microsoft
Douglas Orr (Microsoft)
Dmitry Stratiychuk (Microsoft)
Software Engineer at Microsoft SwiftKey