View on GitHub

NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Semantic parsing

Table of contents

Semantic parsing is the task of translating natural language into a formal meaning representation on which a machine can act. Representations may be an executable language such as SQL or more abstract representations such as Abstract Meaning Representation (AMR).

AMR parsing

Each AMR is a single rooted, directed graph. AMRs include PropBank semantic roles, within-sentence coreference, named entities and types, modality, negation, questions, quantities, and so on. See.

LDC2014T12:

13,051 sentences

Models are evaluated on the newswire section and the full dataset based on smatch. Systems marked with * are pipeline systems that require other systems (i.e. a dependency parse or a SRL parse) as input.

Model F1 Newswire F1 Full Paper / Source
Incremental joint model (Zhou et al., 2016)* 0.71 0.66 AMR Parsing with an Incremental Joint Model
Transition-based transducer (Wang et al., 2015)* 0.70 0.66 Boosting Transition-based AMR Parsing with Refined Actions and Auxiliary Analyzers
Imitation learning (Goodman et al., 2016)* 0.70 Noise reduction and targeted exploration in imitation learning for Abstract Meaning Representation parsing
MT-Based (Pust et al., 2015)* 0.66 Parsing English into Abstract Meaning Representation Using Syntax-Based Machine Translation
Transition-based parser-Stack-LSTM (Ballesteros and Al-Onaizan, 2017)* 0.69 0.64 AMR Parsing using Stack-LSTMs
Transition-based parser-Stack-LSTM (Ballesteros and Al-Onaizan, 2017) 0.68 0.63 AMR Parsing using Stack-LSTMs

LDC2015E86:

19,572 sentences

Models are evaluated based on smatch.

Model Smatch Paper / Source
Joint model (Lyu and Titov, 2018) 73.7 AMR Parsing as Graph Prediction with Latent Alignment
Mul-BiLSTM (Foland and Martin, 2017) 70.7 Abstract Meaning Representation Parsing using LSTM Recurrent Neural Networks
JAMR (Flanigan et al., 2016) 67.0 CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss
CAMR (Wang et al., 2016) 66.5 CAMR at SemEval-2016 Task 8: An Extended Transition-based AMR Parser
AMREager (Damonte et al., 2017) 64.0 An Incremental Parser for Abstract Meaning Representation
SEQ2SEQ + 20M (Konstas et al., 2017) 62.1 Neural AMR: Sequence-to-Sequence Models for Parsing and Generation

LDC2016E25

39,260 sentences

Results are computed over 8 runs. Models are evaluated based on smatch.

Model Smatch Paper / Source
Joint model (Lyu and Titov, 2018) 74.4 AMR Parsing as Graph Prediction with Latent Alignment
ChSeq + 100K (van Noord and Bos, 2017) 71.0 Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations
Neural-Pointer (Buys and Blunsom, 2017) 61.9 Oxford at SemEval-2017 Task 9: Neural AMR Parsing with Pointer-Augmented Attention

SQL parsing

ATIS

5,280 user questions for a flight-booking task:

Example:

Question SQL query
what flights from any city land at MKE SELECT DISTINCT FLIGHTalias0.FLIGHT_ID FROM AIRPORT AS AIRPORTalias0 , AIRPORT_SERVICE AS AIRPORT_SERVICEalias0 , CITY AS CITYalias0 , FLIGHT AS FLIGHTalias0 WHERE AIRPORTalias0.AIRPORT_CODE = "MKE" AND CITYalias0.CITY_CODE = AIRPORT_SERVICEalias0.CITY_CODE AND FLIGHTalias0.FROM_AIRPORT = AIRPORT_SERVICEalias0.AIRPORT_CODE AND FLIGHTalias0.TO_AIRPORT = AIRPORTalias0.AIRPORT_CODE ;
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 51 32 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 45 17 Learning a neural semantic parser from user feedback System
Template Baseline (Finegan-Dollak et al., 2018) 45 0 Improving Text-to-SQL Evaluation Methodology Data and System

Advising

4,570 user questions about university course advising, with manually annotated SQL Finegan-Dollak et al., (2018).

Example:

Question SQL query
Can undergrads take 550 ? SELECT DISTINCT COURSEalias0.ADVISORY_REQUIREMENT , COURSEalias0.ENFORCED_REQUIREMENT , COURSEalias0.NAME FROM COURSE AS COURSEalias0 WHERE COURSEalias0.DEPARTMENT = \"department0\" AND COURSEalias0.NUMBER = 550 ;
Model Question Split Query Split Paper / Source Code
Template Baseline (Finegan-Dollak et al., 2018) 80 0 Improving Text-to-SQL Evaluation Methodology Data and System
Seq2Seq with copying (Finegan-Dollak et al., 2018) 70 0 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 41 1 Learning a neural semantic parser from user feedback System

GeoQuery

877 user questions about US geography:

Example:

Question SQL query
what is the biggest city in arizona SELECT CITYalias0.CITY_NAME FROM CITY AS CITYalias0 WHERE CITYalias0.POPULATION = ( SELECT MAX( CITYalias1.POPULATION ) FROM CITY AS CITYalias1 WHERE CITYalias1.STATE_NAME = "arizona" ) AND CITYalias0.STATE_NAME = "arizona"
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 71 20 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 66 40 Learning a neural semantic parser from user feedback System
Template Baseline (Finegan-Dollak et al., 2018) 66 0 Improving Text-to-SQL Evaluation Methodology Data and System

Scholar

817 user questions about academic publications, with automatically generated SQL that was checked by asking the user if the output was correct.

Example:

Question SQL query
What papers has sharon goldwater written ? SELECT DISTINCT WRITESalias0.PAPERID FROM AUTHOR AS AUTHORalias0 , WRITES AS WRITESalias0 WHERE AUTHORalias0.AUTHORNAME = "sharon goldwater" AND WRITESalias0.AUTHORID = AUTHORalias0.AUTHORID ;
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 59 5 Improving Text-to-SQL Evaluation Methodology Data and System
Template Baseline (Finegan-Dollak et al., 2018) 52 0 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 44 3 Learning a neural semantic parser from user feedback System

Spider

Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables covering 138 different domains. In Spider 1.0, different complex SQL queries and databases appear in train and test sets.

The Spider dataset can be accessed and leaderboard can be accessed here.

WikiSQL

The WikiSQL dataset consists of 87,673 examples of questions, SQL queries, and database tables built from 26,521 tables. Train/dev/test splits are provided so that each table is only in one split. Models are evaluated based on accuracy on execute result matches.

Example:

Question SQL query
How many engine types did Val Musetti use? SELECT COUNT Engine WHERE Driver = Val Musetti
Model Acc ex Paper / Source
TypeSQL+TC (Yu et al., 2018) 82.6 TypeSQL: Knowledge-based Type-Aware Neural Text-to-SQL Generation
SQLNet (Xu et al., 2017) 68.0 Sqlnet: Generating structured queries from natural language without reinforcement learning
Seq2SQL (Zhong et al., 2017) 59.4 Seq2sql: Generating structured queries from natural language using reinforcement learning

Smaller Datasets

Restaurants - 378 questions about restaurants, their cuisine and locations, collected by Tang and Mooney (2000), converted to SQL by [Popescu et al., (2003)]((http://doi.acm.org/10.1145/604045.604070) and Giordani and Moschitti (2012), improved and converted to canonical style by Finegan-Dollak et al., (2018)

Example:

Question SQL query
where is a restaurant in alameda ? SELECT LOCATIONalias0.HOUSE_NUMBER , RESTAURANTalias0.NAME FROM LOCATION AS LOCATIONalias0 , RESTAURANT AS RESTAURANTalias0 WHERE LOCATIONalias0.CITY_NAME = "alameda" AND RESTAURANTalias0.ID = LOCATIONalias0.RESTAURANT_ID ;
Model Question Split Query Split Paper / Source Code
Iyer et al., (2017) 100 8 Learning a neural semantic parser from user feedback System
Seq2Seq with copying (Finegan-Dollak et al., 2018) 100 4 Improving Text-to-SQL Evaluation Methodology Data and System
Template Baseline (Finegan-Dollak et al., 2018) 95 0 Improving Text-to-SQL Evaluation Methodology Data and System

Academic - 196 questions about publications generated by enumerating all of the different queries possible with the Microsoft Academic Search interface, then writing questions for each query Li and Jagadish (2014). Improved and converted to a cononical style by Finegan-Dollak et al., (2018).

Example:

Question SQL query
return me the homepage of PVLDB SELECT JOURNALalias0.HOMEPAGE FROM JOURNAL AS JOURNALalias0 WHERE JOURNALalias0.NAME = "PVLDB" ;
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 81 74 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 76 70 Learning a neural semantic parser from user feedback System
Template Baseline (Finegan-Dollak et al., 2018) 0 0 Improving Text-to-SQL Evaluation Methodology Data and System

Yelp - 128 user questions about the Yelp website Yaghmazadeh et al., 2017. Improved and converted to a cononical style by Finegan-Dollak et al., (2018).

Example:

Question SQL query
List all businesses with rating 3.5 SELECT BUSINESSalias0.NAME FROM BUSINESS AS BUSINESSalias0 WHERE BUSINESSalias0.RATING = 3.5 ;
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 12 4 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 6 6 Learning a neural semantic parser from user feedback System
Template Baseline (Finegan-Dollak et al., 2018) 1 0 Improving Text-to-SQL Evaluation Methodology Data and System

IMDB - 131 user questions about the Internet Movie Database Yaghmazadeh et al., 2017. Improved and converted to a cononical style by Finegan-Dollak et al., (2018).

Example:

Question SQL query
What year was the movie “ The Imitation Game “ produced SELECT MOVIEalias0.RELEASE_YEAR FROM MOVIE AS MOVIEalias0 WHERE MOVIEalias0.TITLE = "The Imitation Game" ;
Model Question Split Query Split Paper / Source Code
Seq2Seq with copying (Finegan-Dollak et al., 2018) 26 9 Improving Text-to-SQL Evaluation Methodology Data and System
Iyer et al., (2017) 10 4 Learning a neural semantic parser from user feedback System
Template Baseline (Finegan-Dollak et al., 2018) 0 0 Improving Text-to-SQL Evaluation Methodology Data and System

Go back to the README