NeurIPS 2019 Expo Talk

Dec. 8, 2019


Innovative approaches in training large-scale language models

Sponsor: Graphcore

Organizers:
Tom Wilson (Graphcore)

Presenters:
Tom Wilson (Graphcore)

Abstract:

Language model pre-training has been shown to be effective for improving many natural language processing tasks. In the last two years we have seen major breakthroughs in extremely large language models. Google's BERT, "Deep Bidirectional Transformers for Language Understanding" (Devlin, Chang, Lee, & Toutanova, 2018), applies the bidirectional training of the Transformer, a popular attention-based model, to language modelling. OpenAI's GPT-2 (Radford, Wu, Child, Luan, Amodei, & Sutskever, 2019) is a large-scale unsupervised language model that generates coherent paragraphs of text, achieves state-of-the-art performance on many language-modelling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization, all without task-specific training.
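To make "bidirectional training" concrete, the short PyTorch sketch below shows the masked-language-modelling idea behind BERT: roughly 15% of the input tokens are replaced with a mask token, and the model predicts them while attending to context on both sides. This is a minimal illustration only; the vocabulary size, model dimensions, and mask-token id are hypothetical, and it is not the training code discussed in the talk.

    # Minimal sketch of BERT-style masked language modelling
    # (hypothetical sizes; not the presenters' actual code).
    import torch
    import torch.nn as nn

    VOCAB, D_MODEL, MASK_ID = 30000, 256, 103  # assumed vocab, width, mask id

    class TinyMaskedLM(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, D_MODEL)
            layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=8)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)
            self.lm_head = nn.Linear(D_MODEL, VOCAB)

        def forward(self, ids):
            # The encoder attends in both directions: every position sees
            # the whole (masked) sequence, unlike a left-to-right LM.
            h = self.encoder(self.embed(ids).transpose(0, 1)).transpose(0, 1)
            return self.lm_head(h)

    tokens = torch.randint(0, VOCAB, (2, 16))   # toy batch of token ids
    masked = tokens.clone()
    mask = torch.rand(tokens.shape) < 0.15      # mask ~15% of positions
    masked[mask] = MASK_ID
    logits = TinyMaskedLM()(masked)
    # Loss is computed only at the masked positions.
    loss = nn.functional.cross_entropy(logits[mask], tokens[mask])

A left-to-right model such as GPT-2 instead maximises the likelihood of each token given only the preceding ones, which is what lets it generate text but restricts each prediction to one-sided context.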

In this talk, Graphcore VP Tom Wilson will explain how these extremely large language models can be trained using Graphcore's new AI processor technology, covering the techniques employed and the state-of-the-art results achieved, and how this work brings further progress to the field.