Constituency parsing

Constituency parsing aims to extract a constituency-based parse tree from a sentence that represents its syntactic structure according to a phrase structure grammar.

Example:

             Sentence (S)
                 |
   +-------------+------------+
   |                          |
 Noun (N)                Verb Phrase (VP)
   |                          |
 John                 +-------+--------+
                      |                |
                    Verb (V)         Noun (N)
                      |                |
                    sees              Bill

Recent approaches convert the parse tree into a sequence following a depth-first traversal so that sequence-to-sequence models can be applied to it. The linearized version of the above parse tree looks as follows: (S (N) (VP V N)).
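
As a rough illustration of such a depth-first linearization, here is a minimal Python sketch. The nested-tuple tree encoding and the exact bracketing convention are illustrative assumptions; the papers cited below differ in details such as whether words or part-of-speech tags are kept in the output sequence.

```python
# Minimal sketch of linearizing a constituency tree via depth-first
# traversal. The (label, *children) tuple encoding is an assumption
# made for this example, not a format used by any particular paper.

def linearize(tree):
    """Emit a bracketed token sequence for a (label, *children) tuple tree."""
    if isinstance(tree, str):  # leaf: a word
        return tree
    label, *children = tree
    inner = " ".join(linearize(child) for child in children)
    return f"({label} {inner})"

tree = ("S",
        ("N", "John"),
        ("VP",
         ("V", "sees"),
         ("N", "Bill")))

print(linearize(tree))
# (S (N John) (VP (V sees) (N Bill)))
# Dropping the words yields the label-only sequence shown above.
```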

Penn Treebank

The Wall Street Journal section of the Penn Treebank is used for evaluating constituency parsers. Section 22 is used for development and Section 23 is used for evaluation. Models are evaluated based on the F1 score over labeled constituents (bracketing F1). Most of the models below incorporate external data or features. For a comparison of single models trained only on WSJ, refer to Kitaev and Klein (2018).
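
As a rough illustration of the metric, the following Python sketch computes F1 over labeled constituent spans. It is a simplification, not the standard evalb implementation (which also handles details such as punctuation and label equivalences), and the (label, start, end) span representation is an assumption made for this example.

```python
# Simplified sketch of bracketing F1: compare the sets of labeled
# constituent spans in the gold and predicted trees.

def bracketing_f1(gold_spans, pred_spans):
    """F1 over labeled spans, each given as a (label, start, end) tuple."""
    gold, pred = set(gold_spans), set(pred_spans)
    matched = len(gold & pred)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)

# Example: one of three predicted spans has the wrong extent.
gold = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)]
pred = [("S", 0, 3), ("NP", 0, 2), ("VP", 1, 3)]
print(f"{bracketing_f1(gold, pred):.3f}")  # 0.667
```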

| Model | F1 score | Paper / Source |
| ----- | -------- | -------------- |
| Self-attentive encoder + ELMo by Kitaev and Klein (2018) | 95.13 | Constituency Parsing with a Self-Attentive Encoder |
| Model combination by Fried et al. (2017) | 94.66 | Improving Neural Parsing by Disentangling Model Combination and Reranking Effects |
| In-order by Liu and Zhang (2017) | 94.2 | In-Order Transition-based Constituent Parsing |
| Semi-supervised LSTM-LM by Choe and Charniak (2016) | 93.8 | Parsing as Language Modeling |
| Stack-only RNNG by Kuncoro et al. (2017) | 93.6 | What Do Recurrent Neural Network Grammars Learn About Syntax? |
| RNN Grammar by Dyer et al. (2016) | 93.3 | Recurrent Neural Network Grammars |
| Transformer by Vaswani et al. (2017) | 92.7 | Attention Is All You Need |
| Semi-supervised LSTM by Vinyals et al. (2015) | 92.1 | Grammar as a Foreign Language |
| Self-trained parser by McClosky et al. (2006) | 92.1 | Effective Self-Training for Parsing |

Go back to the README