Capturing Dependency Syntax with "Deep" Sequential Models

Yoav Goldberg's invited talk at DepLing 2017. He discusses whether LSTMs can learn hierarchical structure, e.g. verb agreement (they can, but only if explicitly supervised), and gives an overview of easy-first parsing with BiLSTM representations. The main takeaway: the best parsers in the world are based on a first-order decomposition over a BiLSTM. We still do not understand very well why this simple model works so well, or what the BiLSTM vectors actually encode.
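
To make "first-order decomposition" concrete, here is a minimal sketch, not the parser from the talk. It assumes each token already has a vector (a stand-in for a BiLSTM output) and scores a tree as the sum of independent head-modifier arc scores. The names `arc_score`, `tree_score`, and `greedy_heads` are hypothetical, and the dot-product scorer is a placeholder for whatever learned scorer a real parser would use.

```python
from typing import List, Sequence

Vector = Sequence[float]


def arc_score(head: Vector, mod: Vector) -> float:
    """Hypothetical arc scorer: a plain dot product of two token vectors.

    A real parser would use a learned scorer over BiLSTM outputs instead.
    """
    return sum(h * m for h, m in zip(head, mod))


def tree_score(vectors: List[Vector], heads: List[int]) -> float:
    """First-order (arc-factored) score: the sum of independent arc scores.

    vectors[0] stands for an artificial ROOT token; heads[m] is the index
    of the head of token m, and heads[0] is ignored.
    """
    return sum(
        arc_score(vectors[heads[m]], vectors[m])
        for m in range(1, len(vectors))
    )


def greedy_heads(vectors: List[Vector]) -> List[int]:
    """Pick the best-scoring head for each token independently.

    This is only an illustration: the result is not guaranteed to be a
    well-formed tree, which is why real parsers run a proper decoding
    algorithm over the same arc scores.
    """
    heads = [0] * len(vectors)
    for m in range(1, len(vectors)):
        candidates = [h for h in range(len(vectors)) if h != m]
        heads[m] = max(
            candidates,
            key=lambda h: arc_score(vectors[h], vectors[m]),
        )
    return heads


# Toy vectors: index 0 is ROOT, indices 1 and 2 are tokens.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_heads = greedy_heads(toy_vectors)
toy_total = tree_score(toy_vectors, toy_heads)
```

The point of the decomposition is that no arc score depends on any other arc in the tree, so all structural context has to come from the token vectors themselves.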


