Advances in Pre-Training Distributed Word Representations

Over the last two years, Facebook AI Research has focused on making word embeddings more efficient with fastText. This new paper does not propose anything novel, but it combines known tricks that are rarely used together: position-dependent vectors (to reweight the word embeddings), phrase embeddings, and subword embeddings. In addition, the authors train on lots of data, which allows them to achieve a new state of the art across intrinsic tasks and on SQuAD. As an added bonus, they make the new pre-trained embeddings available.
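
To make two of these tricks concrete, here is a minimal sketch, not the authors' implementation. It assumes fastText-style character n-grams with `<` and `>` as word-boundary markers, and it assumes position-dependent weighting means rescaling each context word vector elementwise by a vector tied to its position before averaging. The function names, the n-gram range, and the toy vectors are all hypothetical.

```python
import numpy as np


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers.

    Subword models represent a word as the sum of the vectors of these
    n-grams. The 3..6 range is an assumed default, not taken from the paper.
    """
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def position_weighted_context(context_vectors, position_vectors):
    """Average the context word vectors after elementwise position reweighting.

    Row i of context_vectors is the vector of the i-th context word; row i of
    position_vectors is the (assumed learned) vector for that position.
    """
    context_vectors = np.asarray(context_vectors, dtype=float)
    position_vectors = np.asarray(position_vectors, dtype=float)
    return (context_vectors * position_vectors).mean(axis=0)


if __name__ == "__main__":
    print(char_ngrams("where", 3, 3))
    # Toy 2-dimensional vectors for a context of two words.
    print(position_weighted_context([[1, 2], [3, 4]], [[1, 1], [0, 2]]))
```

In a real model the position vectors would be trained jointly with the word vectors; here they are fixed toy values so that the arithmetic is easy to follow.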

