254 — The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

Chen, Firat, Bapna, et al. (1804.09849)

Read on 01 May 2018
#machine-translation  #translation  #deep-learning  #machine-learning  #neural-network  #LSTM  #RNN  #GRU  #RNMT  #Moses  #WMT14  #group:google

There are many challenges facing automated machine translation (MT) techniques (#79). And translation in general is a very tough nut to crack. Or, as Google Translate would say,

Translation generally severe nut crack

Despite this, we have recently seen enormous strides in end-to-end neural MT. Recurrent seq2seq models (e.g., Google's GNMT) were a huge improvement over phrase-based statistical systems like Moses, and were in turn outperformed by convolutional (ConvS2S) and then Transformer models.

In this paper, Google puts forward a series of hybrid models that mix the strengths of these existing approaches, chief among them:

An RNMT+ model — a recurrent neural machine translation model whose encoder and decoder are deep LSTM stacks connected by multi-head attention, augmented with training and regularization techniques popularized by the Transformer (layer normalization, label smoothing, synchronous training). The paper also evaluates hybrids that pair a Transformer-style self-attention encoder with the RNMT+ decoder. A rough architectural sketch follows below.
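To make the encoder–attention–decoder wiring concrete, here is a minimal sketch in PyTorch. All names, layer counts, and dimensions are illustrative assumptions, and the attention wiring is simplified; the paper's actual RNMT+ adds per-gate layer normalization, residual connections, label smoothing, and feeds the attention context into every decoder layer.

```python
# Illustrative sketch of an RNMT+-style encoder-decoder (not the paper's code).
import torch
import torch.nn as nn

class RNMTPlusSketch(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, enc_layers=2, dec_layers=2, n_heads=4):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        # Bidirectional LSTM encoder stack; hidden size halved so the
        # concatenated forward/backward outputs come back to d_model.
        self.encoder = nn.LSTM(d_model, d_model // 2, num_layers=enc_layers,
                               bidirectional=True, batch_first=True)
        # Multi-head attention connecting decoder states to encoder outputs.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Unidirectional LSTM decoder stack; attention context is concatenated
        # to the target embedding at each step (a simplification of the paper).
        self.decoder = nn.LSTM(2 * d_model, d_model, num_layers=dec_layers,
                               batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        enc_out, _ = self.encoder(self.src_emb(src_ids))        # (B, S, d_model)
        tgt = self.tgt_emb(tgt_ids)                              # (B, T, d_model)
        # First decoder pass produces query states, which then attend over the
        # encoder outputs; the resulting context feeds a second decoder pass.
        query, _ = self.decoder(torch.cat([tgt, torch.zeros_like(tgt)], dim=-1))
        context, _ = self.attn(query, enc_out, enc_out)          # (B, T, d_model)
        dec_out, _ = self.decoder(torch.cat([tgt, context], dim=-1))
        return self.out(dec_out)                                 # (B, T, vocab)

# Toy usage with random token ids.
model = RNMTPlusSketch()
logits = model(torch.randint(0, 32000, (2, 7)), torch.randint(0, 32000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 32000])
```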

This new model outperforms the previous state of the art on the WMT14 En→Fr and En→De benchmarks (as measured by BLEU), though it requires more compute than the other MT models tested (aside from the largest Transformer).
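For context, corpus-level BLEU scores like those reported on WMT14 are computed against reference translations. A minimal sketch using the sacrebleu library (a common tool, not necessarily the authors' exact evaluation pipeline) with toy sentences:

```python
# Toy BLEU computation with sacrebleu; hypotheses/references are made up.
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat sat on the mat"]]  # one list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)  # 100.0 for an exact match
```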