NeurIPS 2019 Expo Demo
Dec. 8, 2019
Showcasing how extremely large language models can be trained using new AI processor technology from Graphcore.
Language model pre-training has been shown to be effective for improving many natural language processing tasks. In the last two years we have seen major breakthroughs in extremely large language models. One is Google’s BERT, Deep Bidirectional Transformers for Language Understanding (Devlin, Chang, Lee, Toutanova 2018), which applies bidirectional training of the Transformer, a popular attention model, to language modelling. Another is OpenAI’s GPT-2 (Radford, Wu, Child, Luan, Amodei, Sutskever 2019), a large-scale unsupervised language model that generates coherent paragraphs of text, achieves state-of-the-art performance on many language modelling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization, all without task-specific training.
In this demo, the Graphcore team will showcase how recent, extremely large language models can be trained using new AI processor technology from Graphcore, and will present the techniques employed and the state-of-the-art results achieved.
Demos will run every half hour, on the hour and at half past.