Neural Machine Translation with Word Predictions (EMNLP 2017)

I am generally a fan of papers that explicitly encode linguistic insights into neural models. Learning the initial state is a trick that has long been recommended by Hinton, among others. Intuitively, the initial state should give the model a good starting position for its predictions. For NMT, this means the hidden state should already contain the information needed to predict the words of the sentence. Weng et al. achieve this directly: they add an auxiliary objective that trains the hidden state to predict the words in the sentence. A simple method with convincing results.
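To make the idea concrete, here is a minimal numpy sketch of such an auxiliary word-prediction objective: a hidden state is projected to vocabulary logits, and the loss rewards assigning high probability to every word that appears in the sentence. The projection matrix `W` and the function name are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def word_prediction_loss(h, W, target_ids):
    """Auxiliary loss: the hidden state h should predict all words in the sentence.

    h          -- hidden state, shape (hidden_dim,)
    W          -- hypothetical projection to the vocabulary, shape (vocab_size, hidden_dim)
    target_ids -- indices of the words appearing in the sentence
    """
    # Project the hidden state to vocabulary logits.
    logits = W @ h
    # Numerically stable softmax over the vocabulary.
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    # Negative mean log-probability of the words that actually occur
    # (a bag-of-words objective, added to the usual translation loss).
    return -np.mean(np.log(probs[target_ids]))

rng = np.random.default_rng(0)
hidden_dim, vocab_size = 8, 50
h = rng.standard_normal(hidden_dim)
W = rng.standard_normal((vocab_size, hidden_dim))
loss = word_prediction_loss(h, W, [3, 17, 42])
```

Minimizing this term alongside the translation loss pushes information about the sentence's words into the hidden state, which is exactly the intuition described above.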
