Machine translation is the task of translating a sentence in a source language to a different target language.
Results marked with a * report the mean test score over the window of 21 consecutive evaluations that has the best average dev-set BLEU score, as in Chen et al. (2018).
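The window-based reporting used for the starred results can be sketched as follows. This is a minimal illustration, not the code of Chen et al. (2018): the function name `best_window_test_score` and its arguments are hypothetical, and it assumes the dev-set and test-set BLEU scores were recorded at the same sequence of checkpoints.

```python
from typing import Sequence


def best_window_test_score(
    dev_scores: Sequence[float],
    test_scores: Sequence[float],
    window: int = 21,
) -> float:
    """Mean test score over the window with the highest mean dev score.

    Assumes dev_scores[i] and test_scores[i] come from the same evaluation
    (checkpoint) i. Hypothetical helper, for illustration only.
    """
    if len(dev_scores) != len(test_scores):
        raise ValueError("dev and test score lists must have the same length")
    if window < 1 or len(dev_scores) < window:
        raise ValueError("need at least `window` evaluations")

    best_start = 0
    best_dev_mean = float("-inf")
    # Slide a fixed-size window over consecutive evaluations and keep the
    # start index whose average dev-set score is highest.
    for start in range(len(dev_scores) - window + 1):
        dev_mean = sum(dev_scores[start:start + window]) / window
        if dev_mean > best_dev_mean:
            best_dev_mean = dev_mean
            best_start = start

    # Report the mean *test* score over that same window.
    return sum(test_scores[best_start:best_start + window]) / window
```

Selecting the window on dev-set scores and only then reading off the test scores keeps the test set out of model selection, and averaging over a window smooths out checkpoint-to-checkpoint noise.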
WMT 2014 EN-DE
Models are evaluated on the English-German dataset of the Ninth Workshop on Statistical Machine Translation (WMT 2014) based on BLEU.
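For readers unfamiliar with the metric, the sketch below shows how a corpus-level BLEU score can be computed from modified n-gram precisions and a brevity penalty. It is a simplified illustration under stated assumptions: whitespace tokenization, one reference per sentence, n-grams up to length 4, and no smoothing. The function names `ngram_counts` and `corpus_bleu` are hypothetical, and published scores depend on the tokenization and scoring tool each paper used, so this sketch is not expected to reproduce leaderboard numbers.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count all n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU on a 0-100 scale (single reference, no smoothing).

    Assumption: sentences are already tokenized and joined by spaces.
    """
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = 0
    ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if min(matches) == 0:
        return 0.0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

Counts are accumulated over the whole test set before the precisions are computed, which is why corpus-level BLEU generally differs from the average of per-sentence scores.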
WMT 2014 EN-FR
Similarly, models are evaluated on the English-French dataset of the Ninth Workshop on Statistical Machine Translation (WMT 2014) based on BLEU.
| Model | BLEU | Paper / Source |
| --- | --- | --- |
| DeepL | 45.9 | DeepL Press release |
| Transformer Big + BT (Edunov et al., 2018) | 45.6 | Understanding Back-Translation at Scale |
| MUSE (Zhao et al., 2019) | 43.5 | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning |
| TaLK Convolutions (Lioutas et al., 2020) | 43.2 | Time-aware Large Kernel Convolutions |
| DynamicConv (Wu et al., 2019) | 43.2 | Pay Less Attention With Lightweight and Dynamic Convolutions |
| Transformer Big (Ott et al., 2018) | 43.2 | Scaling Neural Machine Translation |
| RNMT+ (Chen et al., 2018) | 41.0* | The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation |
| Transformer Big (Vaswani et al., 2017) | 41.0 | Attention Is All You Need |
| MoE (Shazeer et al., 2017) | 40.56 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| ConvS2S (Gehring et al., 2017) | 40.46 | Convolutional Sequence to Sequence Learning |
| Transformer Base (Vaswani et al., 2017) | 38.1 | Attention Is All You Need |