Dilated Recurrent Neural Networks (NIPS 2017)


RNNs and LSTMs are common building blocks for Deep Learning-based NLP models, but they have three weaknesses: 1) they struggle to capture complex (long-term) dependencies; 2) they suffer from vanishing/exploding gradients; and 3) they are hard to parallelize efficiently. Chang et al. propose a model that is similar to WaveNet, but for RNNs: they introduce dilated skip-connections, recurrent connections that skip intermediate states, so a cell at step t reads the hidden state from several steps back rather than from the immediately preceding step. This removes the dependency between consecutive time steps and enables the RNN to be parallelized. The resulting DilatedRNN outperforms many more sophisticated architectures across different tasks.
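The core idea can be sketched in a few lines. Below is a minimal, hedged illustration (not the authors' implementation; the weight names `W`, `U`, `b` and the vanilla-RNN cell are assumptions) of a single recurrent layer whose recurrent edge has dilation s, i.e. h_t = tanh(W x_t + U h_{t-s} + b):

```python
import numpy as np

def dilated_rnn_layer(xs, dilation, W, U, b):
    """Vanilla RNN cell with a dilated skip-connection:
    h_t = tanh(W @ x_t + U @ h_{t - dilation} + b).
    `xs` has shape (T, input_dim); returns hidden states of shape (T, hidden_dim)."""
    hidden_dim = W.shape[0]
    hs = []
    for t, x in enumerate(xs):
        # Dilated skip-connection: read the state from `dilation` steps back
        # instead of the immediately preceding step (t - 1).
        h_prev = hs[t - dilation] if t >= dilation else np.zeros(hidden_dim)
        hs.append(np.tanh(W @ x + U @ h_prev + b))
    return np.array(hs)

rng = np.random.default_rng(0)
T, input_dim, hidden_dim, dilation = 8, 3, 4, 2
xs = rng.standard_normal((T, input_dim))
W = rng.standard_normal((hidden_dim, input_dim))
U = rng.standard_normal((hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)
hs = dilated_rnn_layer(xs, dilation, W, U, b)
```

Because h_t only depends on h_{t-s}, the sequence decomposes into s independent sub-sequences (here, even and odd time steps), which is what makes the layer parallelizable.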
