Evolving Domain Adaptation of Pretrained Language Models for Text Classification
Yun-Shiuan Chuang · Rheeya Uppaal · Yi Wu · Luhang Sun · Makesh Narsimhan Sreedhar · Sijia Yang · Timothy T Rogers · Junjie Hu
Keywords:
Time-Series Data
Evolving Domain Adaptation
Continual Domain Adaptation
Pre-trained Language Model
Unsupervised Domain Adaptation
Semi-Supervised Learning
Abstract
Pre-trained language models have shown impressive performance in various text classification tasks. However, the performance of these models is highly dependent on the quality and domain of the labeled examples. In dynamic real-world environments, text data content naturally evolves over time, giving rise to an $\textit{evolving domain shift}$. This continuous temporal shift impairs the performance of static models, as their training becomes increasingly outdated. To address this issue, we propose two dynamic buffer-based adaptation strategies: one utilizes self-training with pseudo-labeling, and the other employs a tuning-free, in-context learning approach for large language models (LLMs). We validate our methods with extensive experiments on two longitudinal real-world social media datasets, demonstrating that they outperform unadapted baselines. Furthermore, we introduce a COVID-19 vaccination stance detection dataset, serving as a benchmark for evaluating pre-trained language models within evolving domain adaptation settings.
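For concreteness, the buffer-based self-training strategy can be sketched roughly as follows: a sliding buffer holds recent unlabeled examples, the current classifier pseudo-labels them, and only confident predictions are used for further fine-tuning. This is a minimal illustrative sketch, not the paper's implementation; the `model` interface (`predict_proba`, `fine_tune`), the buffer size, and the confidence threshold are all assumed placeholders.

```python
from collections import deque


def evolving_self_training(model, time_ordered_batches,
                           buffer_size=1000, confidence_threshold=0.9):
    """Sketch of buffer-based self-training under evolving domain shift.

    `model` is a hypothetical classifier exposing:
      - predict_proba(texts) -> (predicted_labels, confidences)
      - fine_tune(texts, labels) -> None
    Both method names are assumptions for illustration only.
    """
    # Sliding buffer of the most recent unlabeled texts.
    buffer = deque(maxlen=buffer_size)

    for batch in time_ordered_batches:  # batches arrive in temporal order
        buffer.extend(batch)

        # Pseudo-label the buffered examples with the current model.
        labels, confidences = model.predict_proba(list(buffer))

        # Keep only high-confidence pseudo-labels to limit error accumulation.
        selected = [(x, y) for x, y, c in zip(buffer, labels, confidences)
                    if c >= confidence_threshold]

        if selected:
            texts, pseudo_labels = zip(*selected)
            # Adapt the model to the current time window.
            model.fine_tune(list(texts), list(pseudo_labels))

    return model
```

The in-context learning variant described in the abstract would instead fill the LLM prompt with examples drawn from the same kind of buffer, requiring no parameter updates.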