Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech include noun, verb, adjective, adverb, pronoun, preposition, and conjunction.
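As a minimal illustration of what a tagger produces, the sketch below represents one hand-tagged sentence as a list of (word, tag) pairs. The sentence is made up for this example, the tags follow the Penn Treebank convention (`DT` determiner, `NN` singular noun, `VBZ` third-person singular present verb, `RB` adverb), and the helper `tag_counts` is a hypothetical name, not part of any library.

```python
from collections import Counter

# A hand-tagged example sentence (illustrative, not taken from any corpus).
# Each token is paired with exactly one Penn Treebank-style tag.
tagged_sentence = [
    ("The", "DT"),
    ("dog", "NN"),
    ("barks", "VBZ"),
    ("loudly", "RB"),
    (".", "."),
]


def tag_counts(tagged):
    """Count how often each tag occurs in a list of (word, tag) pairs."""
    return Counter(tag for _, tag in tagged)


# Split the pairs into the input (words) and the expected output (tags).
words = [word for word, _ in tagged_sentence]
tags = [tag for _, tag in tagged_sentence]

print(words)
print(tags)
print(tag_counts(tagged_sentence))
```

A tagger is then a function from `words` to a tag sequence of the same length, which is what makes token-level comparison against a gold standard straightforward.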
A standard dataset for POS tagging is the Wall Street Journal (WSJ) portion of the Penn Treebank, which is annotated with 45 distinct POS tags. Sections 0-18 are used for training, sections 19-21 for development, and sections 22-24 for testing. Models are evaluated by per-token tagging accuracy.
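The split and the metric described above can be sketched in a few lines. This is an assumption-laden illustration rather than an official evaluation script: `wsj_split` and `token_accuracy` are hypothetical helper names, accuracy is assumed to be computed per token over all sentences pooled together, and the gold and predicted tags in the usage example are invented.

```python
from typing import List, Sequence


def wsj_split(section: int) -> str:
    """Map a WSJ section number to the split described in the text."""
    if 0 <= section <= 18:
        return "train"
    if 19 <= section <= 21:
        return "dev"
    if 22 <= section <= 24:
        return "test"
    raise ValueError(f"WSJ sections run from 0 to 24, got {section}")


def token_accuracy(gold: Sequence[List[str]], predicted: Sequence[List[str]]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_tags, predicted_tags in zip(gold, predicted):
        if len(gold_tags) != len(predicted_tags):
            raise ValueError("each sentence must have one predicted tag per gold tag")
        correct += sum(g == p for g, p in zip(gold_tags, predicted_tags))
        total += len(gold_tags)
    if total == 0:
        raise ValueError("cannot compute accuracy over zero tokens")
    return correct / total


# Invented example: one of six tokens is mistagged (VBZ predicted as NNS).
gold = [["DT", "NN", "VBZ"], ["PRP", "VBD", "RB"]]
predicted = [["DT", "NN", "NNS"], ["PRP", "VBD", "RB"]]
accuracy = token_accuracy(gold, predicted)
print(f"{accuracy:.4f}")
```

Pooling tokens across sentences means long sentences contribute more to the score than short ones, which matches the usual reading of "accuracy" for this benchmark.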
The Ritter (2011) dataset has become the benchmark for social media part-of-speech tagging. It comprises roughly 50K tokens of English social media text sampled in late 2011 and is tagged with an extended version of the Penn Treebank (PTB) tagset.
| Model | Accuracy | Paper / Source |
| --- | --- | --- |
| GATE | 88.69 | Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data |
| CMU | 90.0 ± 0.5 | Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters |
Universal Dependencies (UD) is a framework for cross-linguistic grammatical annotation, with more than 100 treebanks in over 60 languages. Models are typically evaluated by their average test accuracy across 21 high-resource languages; models marked with ♦ were evaluated on 17 languages.
| Model | Avg. accuracy | Paper / Source |
| --- | --- | --- |
| Multilingual BERT and BPEmb (Heinzerling and Strube, 2019) | 96.77 | Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation |
| Adversarial Bi-LSTM (Yasunaga et al., 2018) | 96.65 | Robust Multilingual Part-of-Speech Tagging via Adversarial Training |
| MultiBPEmb (Heinzerling and Strube, 2019) | 96.62 | Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation |
| Bi-LSTM (Plank et al., 2016) | 96.40 | Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss |
| Joint Bi-LSTM (Nguyen et al., 2017)♦ | 95.55 | A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing |
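
The averaged score in the table above can be sketched as follows, assuming the average is an unweighted mean of per-language test accuracies, so that every language counts equally regardless of treebank size. The function name `average_accuracy`, the language codes, and the accuracy values are all hypothetical and are not results from any of the listed papers.

```python
from typing import Mapping


def average_accuracy(per_language: Mapping[str, float]) -> float:
    """Unweighted mean of per-language accuracies (each language counts once)."""
    if not per_language:
        raise ValueError("need at least one language to average over")
    return sum(per_language.values()) / len(per_language)


# Made-up per-language test accuracies, in percent.
example_scores = {"en": 95.0, "de": 93.0, "fi": 94.0}
print(average_accuracy(example_scores))
```

Because the mean depends on which languages are included, a score averaged over 17 languages is not directly comparable with one averaged over 21.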