NeurIPS Pre-training large-scale language models aware of social interactions at Twitter

Keynote in Affinity Workshop: LatinX in AI

Omar Florez

Abstract:

In this talk, we discuss the adaptation of non-parametric retrieval models to evolving online conversations. We demonstrate that, with a static neural encoder, the datastore can simply be replaced with up-to-date information to accommodate adaptation and deletion without degradation. Modern deep learning models can achieve the same goals by fine-tuning at regular time intervals, but doing so requires a large computational budget. Our best non-parametric approach consistently outperforms parametric models (BART’s encoder and sequence-to-sequence models) over the course of a year (48 weeks), with an average relative gain of 64.12% in recall when the test distribution shifts, and it outperforms fine-tuned models with an average relative gain of 11.58% in recall. Our empirical analysis highlights non-parametric techniques as a practical and promising direction for adapting to distribution shifts, and it may facilitate future work on the challenges that temporality poses for real-world NLP systems that must be deployed at minimal computational cost.
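
The datastore-replacement idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the system described in the talk: `encode` is a hypothetical stand-in for a frozen neural encoder (a normalised bag of words), `Datastore` is a toy in-memory store, and the example posts and hashtag labels are invented. The point is only that the encoder never changes; adaptation and deletion happen by editing the stored entries.

```python
import math
from collections import Counter


def encode(text):
    """Hypothetical stand-in for a frozen encoder: an L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def similarity(a, b):
    """Dot product of two sparse vectors (cosine, since both are normalised)."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


class Datastore:
    """Toy non-parametric memory: key -> (encoded text, label)."""

    def __init__(self):
        self.items = {}

    def add(self, key, text, label):
        self.items[key] = (encode(text), label)

    def delete(self, key):
        # Deletion is a dictionary removal; no model weights are touched.
        self.items.pop(key, None)

    def replace(self, entries):
        # Adaptation: swap in up-to-date entries, keeping the encoder fixed.
        self.items = {}
        for key, text, label in entries:
            self.add(key, text, label)

    def search(self, query, k=1):
        q = encode(query)
        ranked = sorted(
            self.items.values(),
            key=lambda item: similarity(q, item[0]),
            reverse=True,
        )
        return [label for _, label in ranked[:k]]


def recall_at_k(store, queries, k=1):
    """Fraction of (text, gold_label) queries whose gold label is in the top k."""
    if not queries:
        return 0.0
    hits = sum(1 for text, gold in queries if gold in store.search(text, k))
    return hits / len(queries)


# Invented data: a datastore built in "week 1", queried after the topic shifts.
store = Datastore()
store.replace([
    ("a", "world cup final tonight", "#WorldCup"),
    ("b", "new phone launch event", "#Tech"),
])
shifted_queries = [("award show red carpet tonight", "#Awards")]
before = recall_at_k(store, shifted_queries)

# Refresh the datastore with current entries; the encoder is unchanged.
store.replace([
    ("c", "award show red carpet live", "#Awards"),
    ("b", "new phone launch event", "#Tech"),
])
after = recall_at_k(store, shifted_queries)
print(before, after)
```

In this toy setting, recall on the shifted query rises once the datastore is refreshed, and calling `store.delete("c")` afterwards removes the entry from all future retrievals without any retraining. A real deployment would presumably use a learned encoder and an approximate nearest-neighbour index rather than a sorted dictionary.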
