
Workshop: Workshop on robustness of zero/few-shot learning in foundation models (R0-FoMo)

Dissecting In-Context Learning of Translations

Vikas Raunak · Arul Menezes · Hany Awadalla


Most recent work on leveraging Large Language Models (LLMs) such as GPT-3 for Machine Translation (MT) through in-context learning of translations has focused on selecting the few-shot demonstration samples. In this work, we characterize the robustness of LLMs from the GPT family to certain perturbations of few-shot translation demonstrations as a means to dissect the in-context learning of translations. In particular, we probe the role of demonstration attributes by perturbing high-quality, in-domain demonstrations. We find that asymmetric perturbation of the source-target mappings yields vastly different results: perturbing the source side has surprisingly little impact, while perturbing the target side can drastically reduce translation quality. This suggests that the output text distribution provides the most important learning signal during in-context learning of translations. Based on these findings, we propose a method named Zero-Shot-Context that adds this signal automatically in zero-shot prompting. Our proposed method greatly improves the zero-shot translation performance of GPT-3, making it competitive with few-shot prompted translations.
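The asymmetric-perturbation setup described above can be sketched as follows. This is a minimal illustration, not the paper's actual code: the prompt template, language pair, helper names, and the use of word shuffling as the perturbation are all assumptions made here for clarity.

```python
import random

def build_prompt(demos, test_source, perturb=None, seed=0):
    """Build a few-shot translation prompt from (source, target) demo pairs,
    optionally perturbing one side of the demonstrations.

    perturb: None, "source", or "target" -- shuffles the words on that side
    of each demonstration, breaking the source-target mapping asymmetrically.
    (Hypothetical helper; the prompt format is an assumption.)
    """
    rng = random.Random(seed)

    def shuffle_words(text):
        words = text.split()
        rng.shuffle(words)
        return " ".join(words)

    lines = []
    for src, tgt in demos:
        if perturb == "source":
            src = shuffle_words(src)   # source-side perturbation
        elif perturb == "target":
            tgt = shuffle_words(tgt)   # target-side perturbation
        lines.append(f"English: {src}\nFrench: {tgt}")
    # The test input is never perturbed; the model completes the target.
    lines.append(f"English: {test_source}\nFrench:")
    return "\n\n".join(lines)

demos = [("The cat sat.", "Le chat s'est assis."),
         ("I like tea.", "J'aime le the.")]
prompt = build_prompt(demos, "Good morning.", perturb="target")
```

Prompts built with `perturb="source"` and `perturb="target"` can then be sent to the same model and scored (e.g., with COMET or BLEU) to compare how each side's corruption affects output quality.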
