We introduce Data Diversification: a simple but effective strategy to boost neural machine translation (NMT) performance. It diversifies the training data by using the predictions of multiple forward and backward models and then merging them with the original dataset on which the final NMT model is trained. Our method is applicable to all NMT models. It does not require extra monolingual data like back-translation, nor does it add more computations and parameters like ensembles of models. Our method achieves state-of-the-art BLEU scores of 30.7 and 43.7 on the WMT'14 English-German and English-French translation tasks, respectively. It also substantially improves on 8 other translation tasks: 4 IWSLT tasks (English-German and English-French) and 4 low-resource translation tasks (English-Nepali and English-Sinhala). We demonstrate that our method is more effective than knowledge distillation and dual learning, that it exhibits a strong correlation with ensembles of models, and that it trades off perplexity for better BLEU scores.
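The following is a minimal sketch of the data diversification procedure as described in the abstract: train several forward (source-to-target) and backward (target-to-source) models, translate the training data with them, and merge the synthetic pairs with the original corpus before training the final model. The helpers `train_model` and `translate` are hypothetical placeholders for an actual NMT training and inference pipeline; they are not part of the paper's released code.

```python
def data_diversification(src_sents, tgt_sents, k=3):
    """Build a diversified parallel corpus from k forward and k backward models (sketch)."""
    # Start with the original parallel data.
    diversified = list(zip(src_sents, tgt_sents))

    for i in range(k):
        # Forward model: source -> target. Its translations of the training source
        # side are added as extra (source, synthetic target) pairs.
        fwd = train_model(src_sents, tgt_sents, seed=i)       # hypothetical helper
        diversified += list(zip(src_sents, translate(fwd, src_sents)))

        # Backward model: target -> source. Its translations of the training target
        # side are added as extra (synthetic source, target) pairs.
        bwd = train_model(tgt_sents, src_sents, seed=i)       # hypothetical helper
        diversified += list(zip(translate(bwd, tgt_sents), tgt_sents))

    # The final NMT model is then trained on this merged dataset.
    return diversified
```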
Author Information
Xuan-Phi Nguyen (Nanyang Technological University)
Shafiq Joty (Nanyang Technological University)
Kui Wu (Institute for Infocomm Research, Singapore)
Ai Ti Aw (Institute for Infocomm Research)
More from the Same Authors
- 2021 Spotlight: Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
  Junnan Li · Ramprasaath Selvaraju · Akhilesh Gotmare · Shafiq Joty · Caiming Xiong · Steven Chu Hong Hoi
- 2022 Poster: Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Translation Model
  Xuan-Phi Nguyen · Shafiq Joty · Kui Wu · Ai Ti Aw
- 2021 Poster: Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
  Junnan Li · Ramprasaath Selvaraju · Akhilesh Gotmare · Shafiq Joty · Caiming Xiong · Steven Chu Hong Hoi
- 2020 Poster: Self-Supervised Relationship Probing
  Jiuxiang Gu · Jason Kuen · Shafiq Joty · Jianfei Cai · Vlad I. Morariu · Handong Zhao · Tong Sun