[1609.07843] Pointer Sentinel Mixture Models

arxiv.org

The pointer sentinel mixture architecture for neural sequence models can either reproduce a word from the recent context or produce a word from a standard softmax classifier. It achieves state-of-the-art language modeling performance on the Penn Treebank (70.9 perplexity) while using fewer parameters than a standard softmax LSTM. By Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher (MetaMind).
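
To make the "copy from context or use the softmax" idea concrete, here is a minimal sketch of the mixing step in plain Python. It assumes the pointer scores, the sentinel score, and the vocabulary logits have already been computed by some recurrent network, which is not shown. The function names, argument names, and toy numbers are illustrative assumptions, not the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_logits, pointer_scores, sentinel_score, context_ids):
    """Mix a vocabulary softmax with a pointer over recent context words.

    vocab_logits   : one score per vocabulary word (hypothetical model output)
    pointer_scores : one score per position in the recent context
    sentinel_score : score of the sentinel, which competes with the positions
    context_ids    : vocabulary id of the word at each context position
    """
    p_vocab = softmax(vocab_logits)

    # A single softmax over context positions plus the sentinel. The mass
    # that lands on the sentinel acts as the gate for the vocabulary softmax.
    joint = softmax(list(pointer_scores) + [sentinel_score])
    gate = joint[-1]

    # Start from the gated vocabulary distribution ...
    probs = [gate * p for p in p_vocab]
    # ... then add each position's pointer mass to the word it points at.
    for word_id, mass in zip(context_ids, joint[:-1]):
        probs[word_id] += mass
    return probs, gate


# Toy example: 4-word vocabulary, 3 context positions (word 2 appears twice).
probs, gate = pointer_sentinel_mixture(
    vocab_logits=[0.0, 0.0, 0.0, 0.0],
    pointer_scores=[0.0, 0.0, 0.0],
    sentinel_score=0.0,
    context_ids=[2, 2, 1],
)
print(gate, probs)
```

Because the sentinel shares one softmax with the context positions, the pointer mass and the gate sum to one, so the mixed output is a valid probability distribution without any extra normalization. A word that occurs several times in the context collects the pointer mass from each occurrence.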
