DisSent: Sentence Representation Learning from Explicit Discourse Relations (arXiv)


In recent years, we've seen different ways of learning sentence embeddings: some are completely unsupervised, while others depend on paraphrases or leverage a specific supervised task, e.g., entailment. Nie et al. instead learn sentence embeddings by predicting the discourse marker that connects two sentences. They automatically create a large training set for this task by collecting sentence pairs from the BookCorpus that are connected by frequent discourse markers. The resulting embeddings complement recently proposed embeddings that leverage entailment.
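To make the data collection step concrete, here is a minimal sketch of how marker-connected pairs could be pulled from raw text. It is a simplification under stated assumptions, not the authors' pipeline: the marker list `MARKERS`, the function name `extract_pairs`, and the regex-based matching are all hypothetical choices made for illustration. Each extracted triple gives two text spans and the marker that joins them, which then serves as the label to predict.

```python
import re

# Hypothetical marker list for illustration only; not the set used in the paper.
MARKERS = ("because", "but", "so", "although", "when", "if")


def extract_pairs(text, markers=MARKERS):
    """Return (s1, s2, marker) triples for sentences joined by a marker.

    Uses a naive sentence split and a surface-level regex, so it only
    catches markers that sit between two clauses of the same sentence.
    """
    pairs = []
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sent in sentences:
        for m in markers:
            # Marker must be a standalone word in the middle of the sentence,
            # optionally preceded by a comma.
            pattern = rf"^(.+?),?\s+{re.escape(m)}\s+(.+?)[.!?]?$"
            match = re.search(pattern, sent, flags=re.IGNORECASE)
            if match:
                pairs.append((match.group(1).strip(), match.group(2).strip(), m))
                break  # keep at most one marker per sentence
    return pairs


if __name__ == "__main__":
    sample = (
        "She stayed home because it was raining. "
        "He smiled. "
        "I was tired, but I kept going."
    )
    for s1, s2, marker in extract_pairs(sample):
        print(f"{marker!r}: {s1!r} -> {s2!r}")
```

A classifier trained on such triples takes the encodings of the two spans as input and predicts the marker; the sentence encoder learned along the way is what yields the embeddings.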


