
Workshop: Workshop on Distribution Shifts: New Frontiers with Foundation Models

Evolving Domain Adaptation of Pretrained Language Models for Text Classification

Yun-Shiuan Chuang · Rheeya Uppaal · Yi Wu · Luhang Sun · Makesh Narsimhan Sreedhar · Sijia Yang · Timothy T Rogers · Junjie Hu

Keywords: [ Time-Series Data ] [ Evolving Domain Adaptation ] [ Continual Domain Adaptation ] [ Pre-trained Language Model ] [ Unsupervised Domain Adaptation ] [ Semi-Supervised Learning ]

Abstract: Pre-trained language models have shown impressive performance on a variety of text classification tasks. However, their performance depends heavily on the quality and domain of the labeled examples. In dynamic real-world environments, text content evolves over time, producing an $\textit{evolving domain shift}$. This continuous temporal shift degrades static models as their training data becomes increasingly outdated. To address this issue, we propose two dynamic buffer-based adaptation strategies: one uses self-training with pseudo-labeling, and the other employs a tuning-free, in-context learning approach for large language models (LLMs). We validate our methods with extensive experiments on two longitudinal real-world social media datasets, demonstrating their superiority over unadapted baselines. Furthermore, we introduce a COVID-19 vaccination stance detection dataset that serves as a benchmark for evaluating pre-trained language models in evolving domain adaptation settings.
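
The abstract does not spell out how the buffer is maintained, so the following is only a minimal sketch of the general idea of buffer-based self-training with pseudo-labeling, not the authors' implementation. It assumes a toy bag-of-words classifier in place of a pre-trained language model, a first-in-first-out buffer of fixed size, and a confidence threshold for accepting pseudo-labels. All names (`train`, `predict`, `adapt_over_time`, `buffer_size`, `threshold`) are hypothetical.

```python
from collections import Counter, deque


def train(examples):
    """Fit a toy model: per-label token counts (stand-in for a fine-tuned LM)."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Return (label, confidence) from token-overlap scores, or (None, 0.0)."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0  # no known tokens, so the model abstains
    label = max(scores, key=scores.get)
    return label, scores[label] / total


def adapt_over_time(seed, stream, buffer_size=4, threshold=0.6):
    """Self-train across time steps with a bounded pseudo-label buffer.

    Assumption: the buffer is FIFO, so the oldest pseudo-labeled
    examples are evicted first as newer ones arrive.
    """
    buffer = deque(maxlen=buffer_size)
    model = train(seed)
    for batch in stream:  # one batch of unlabeled texts per time step
        for text in batch:
            label, confidence = predict(model, text)
            if label is not None and confidence >= threshold:
                buffer.append((text, label))  # keep confident pseudo-labels
        # Retrain on the labeled seed plus the current buffer contents.
        model = train(list(seed) + list(buffer))
    return model, list(buffer)


seed = [("vaccine safe effective", "pro"), ("vaccine dangerous harmful", "anti")]
stream = [["safe effective booster"], ["booster works"]]
model, buffer = adapt_over_time(seed, stream)
```

In this toy run, the static seed model abstains on "booster works" because it has seen neither token, while the adapted model picks up "booster" from a pseudo-labeled example at the first time step and can then label the later text. A real system would replace `train` and `predict` with fine-tuning and inference on a pre-trained language model.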
