Poster in Workshop: Synthetic Data for Empowering ML Research

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Maximillian Chen ⋅ Alexandros Papangelis ⋅ Chenyang Tao ⋅ Andy Rosenbaum ⋅ Seokhwan Kim ⋅ Yang Liu ⋅ Zhou Yu ⋅ Dilek Hakkani-Tur

2022

Abstract

Dialogue understanding tasks often require abundant annotated data to achieve good performance, which presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by prompting large pre-trained language models, and we present a novel approach that iterates on augmentation quality by applying weakly-supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. Models fine-tuned on our augmented data mixed with few-shot ground truth data approach or surpass existing state-of-the-art performance on both datasets. For DailyDialog specifically, using 10% of the ground truth data, we outperform the current state-of-the-art model, which uses 100% of the data.
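The abstract does not spell out the prompt format or the filters, so the following is only a minimal Python sketch of the general idea: build a few-shot prompt from a handful of labeled utterances, then keep only generated candidates that a weak labeler assigns to the intended class. The names `build_prompt`, `weak_filter`, and `keyword_labeler`, the prompt layout, and the keyword heuristic are all hypothetical illustrations, and the candidate list stands in for language model output rather than reproducing any result from the paper.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (utterance, label)


def build_prompt(examples: List[Example], target_label: str) -> str:
    """Format labeled examples as a few-shot prompt ending with an open slot.

    The layout is an assumption for illustration, not the paper's format.
    """
    blocks = [f"Label: {label}\nUtterance: {text}" for text, label in examples]
    blocks.append(f"Label: {target_label}\nUtterance:")
    return "\n\n".join(blocks)


def weak_filter(
    candidates: List[str],
    target_label: str,
    weak_labeler: Callable[[str], str],
) -> List[str]:
    """Keep candidates whose weak label agrees with the label they were prompted for."""
    return [c for c in candidates if weak_labeler(c) == target_label]


def keyword_labeler(text: str) -> str:
    """Toy weak labeler (hypothetical): a keyword heuristic for one emotion class."""
    cues = ("glad", "great", "happy")
    return "happiness" if any(cue in text.lower() for cue in cues) else "other"


# A few seed examples, standing in for the few-shot ground truth data.
seed = [
    ("I'm so glad you could make it!", "happiness"),
    ("The train leaves at nine.", "other"),
]
prompt = build_prompt(seed, "happiness")

# Placeholder for text a language model might return for the prompt.
candidates = [
    "That's great news, congratulations!",
    "Please send me the report by Friday.",
]
kept = weak_filter(candidates, "happiness", keyword_labeler)
```

In this sketch the retained candidates would be mixed with the seed examples before fine-tuning a classifier; a real filter would presumably be stronger than a keyword match.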
