NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Grammatical Error Correction

Grammatical Error Correction (GEC) is the task of correcting grammatical mistakes in a sentence.

| Error | Corrected |
| --- | --- |
| She see Tom is catched by policeman in park at last night. | She saw Tom caught by a policeman in the park last night. |

CoNLL-2014

The CoNLL-2014 benchmark is evaluated on the test split of the NUS Corpus of Learner English (NUCLE) dataset. The CoNLL-2014 test set contains 1,312 English sentences with grammatical error correction annotations by 2 annotators. Models are evaluated with the F-score with β=0.5 (F0.5), which weights precision twice as much as recall.
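To make the effect of β=0.5 concrete, here is a minimal sketch of the F-beta formula computed from precision and recall. The helper name `f_beta` and the example values are illustrative assumptions. This is not the benchmark's scoring tool, which additionally has to extract and match edits against the annotations before precision and recall can be computed.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Return the F-beta score for a given precision and recall.

    Hypothetical helper for illustration only. With beta < 1 the score
    leans towards precision; with beta = 1 it is the usual F1.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Two made-up systems with mirrored precision and recall.
precise_system = f_beta(precision=0.8, recall=0.4)   # about 0.667
thorough_system = f_beta(precision=0.4, recall=0.8)  # about 0.444
balanced_f1 = f_beta(precision=0.8, recall=0.4, beta=1.0)  # about 0.533
```

Under F0.5 the high-precision system scores well above the high-recall one, although both would receive the same F1. This reflects the preference in GEC for systems that propose fewer but more reliable corrections.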

| Model | F0.5 | Paper / Source | Code |
| --- | --- | --- | --- |
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 | Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study | NA |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 56.25 | Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | NA |
| Transformer (Junczys-Dowmunt et al., 2018) | 55.8 | Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 54.79 | A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction | Official |

CoNLL-2014 10 Annotators

Bryant and Ng (2015) collected corrections from 10 annotators for the 1,312 sentences of the CoNLL-2014 test set.

| Model | F0.5 | Paper / Source | Code |
| --- | --- | --- | --- |
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 | Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study | NA |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 72.04 | Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 70.14 (measured by Ge et al., 2018) | A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction | Official |

JFLEG

The JFLEG corpus (Napoles et al., 2017) consists of 1,511 English sentences with annotations. Models are evaluated with the GLEU metric.

| Model | GLEU | Paper / Source | Code |
| --- | --- | --- | --- |
| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 | Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study | NA |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 61.50 | Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | NA |
| Transformer (Junczys-Dowmunt et al., 2018) | 59.9 | Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 57.47 | A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction | Official |