[1707.05589] On the State of the Art of Evaluation in Neural Language Models


In recent years, increasingly complex models have been used to achieve state-of-the-art results in language modelling on the Penn Treebank, the MNIST of NLP. Melis et al. show that a carefully tuned LSTM using current best practices (weight tying, recurrent dropout, down-projection, etc.) outperforms these more complex models. The takeaway: LSTMs are here to stay (if that wasn't clear already); tune your hyperparameters if you care about state-of-the-art results.
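To make two of those practices concrete, here is a minimal NumPy sketch of weight tying combined with down-projection. It is not the authors' code: the function name `tied_output_logits` and all sizes are assumptions chosen for illustration. The idea is that the LSTM's hidden state is first projected down to the embedding size, and the input embedding matrix is then reused as the output weights.

```python
import numpy as np


def tied_output_logits(hidden, embedding, down_proj):
    """Project hidden states down to the embedding size, then reuse the
    input embedding matrix as the output (softmax) weights."""
    projected = hidden @ down_proj   # (batch, hidden) -> (batch, embed)
    return projected @ embedding.T   # (batch, embed)  -> (batch, vocab)


rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not values from the paper).
vocab_size, embed_size, hidden_size = 10_000, 64, 256

embedding = rng.normal(size=(vocab_size, embed_size))   # shared input/output weights
down_proj = rng.normal(size=(hidden_size, embed_size))  # down-projection matrix
hidden = rng.normal(size=(2, hidden_size))              # stand-in for LSTM outputs

logits = tied_output_logits(hidden, embedding, down_proj)

# Extra output-side parameters with a separate softmax matrix versus
# with tying, where only the down-projection is new.
untied_params = vocab_size * hidden_size
tied_params = hidden_size * embed_size
```

With these sizes, the output path needs far fewer extra parameters when the embedding is shared, which is one reason the trick is a common default for language models.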

