DisSent: Sentence Representation Learning from Explicit Discourse Relations (arXiv)


In recent years, we've seen many different ways of learning sentence embeddings: some are completely unsupervised, others rely on paraphrase data or on supervision from a specific task such as entailment. Nie et al. learn sentence embeddings by predicting discourse markers between sentences. They automatically build a large training set for this task by collecting sentence pairs from the BookCorpus that are connected by frequent discourse markers such as "because" or "but". The resulting embeddings complement recently proposed embeddings trained on entailment data.
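The data-collection idea can be illustrated with a rough sketch. The snippet below extracts (sentence 1, marker, sentence 2) training triples from raw text using two simple string patterns; this is only a toy approximation (the marker list is an illustrative subset, and the paper's actual pipeline uses dependency parsing to split sentences around markers):

```python
import re

# Illustrative subset of frequent discourse markers (the paper uses a larger, curated set).
MARKERS = ["because", "but", "so", "although", "when", "while", "if", "then"]

def extract_pairs(sentences):
    """Extract (s1, marker, s2) triples from a list of sentences.

    Two simple patterns (a rough sketch, not the paper's dependency-based extraction):
      1. intra-sentence: "S1(,) marker S2"          -> (S1, marker, S2)
      2. inter-sentence: "S1." followed by "Marker S2" -> (S1, marker, S2)
    """
    pairs = []
    prev = None
    for sent in sentences:
        for m in MARKERS:
            # Pattern 1: marker occurs inside the sentence.
            match = re.match(rf"(.+?),?\s+{m}\s+(.+)", sent, re.IGNORECASE)
            if match:
                pairs.append((match.group(1).strip(), m, match.group(2).strip()))
                break
            # Pattern 2: sentence starts with the marker; pair it with the previous sentence.
            if prev and sent.lower().startswith(m + " "):
                pairs.append((prev, m, sent[len(m):].strip(" ,")))
                break
        prev = sent
    return pairs

corpus = [
    "I stayed home because it was raining.",
    "She finished the report.",
    "But nobody read it.",
]
print(extract_pairs(corpus))
```

Each triple then becomes a classification example: the model encodes both sentences and predicts which marker connected them, so the supervision signal comes for free from the raw corpus.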
