
NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Hindi

Chunking

| Model | Dev accuracy | Test F1 | Paper / Source | Code |
| --- | --- | --- | --- | --- |
| Dalal et al. (2006) | 87.40 | 82.40 | Hindi Part-of-Speech Tagging and Chunking: A Maximum Entropy Approach | |

Part-of-speech tagging

| Model | Dev accuracy | Test F1 | Paper / Source | Code |
| --- | --- | --- | --- | --- |
| Jha et al. (2018) | 99.30 | 99.06 | Multi-Task Deep Morphological Analyzer: Context-Aware Joint Morphological Tagging and Lemma Prediction | mt-dma |
| Dalal et al. (2006) | 89.35 | 82.22 | Hindi Part-of-Speech Tagging and Chunking: A Maximum Entropy Approach | |

Machine Translation

The IIT Bombay English-Hindi Parallel Corpus used by Kunchukuttan et al. (2018) can be accessed here. A live leaderboard covering additional translation directions involving Hindi is available on the evaluation website for the Workshop on Asian Translation.

Hindi -> English

| Model | BLEU | Paper / Source | Code |
| --- | --- | --- | --- |
| Philip et al. (2020) | 24.85 | Revisiting Low Resource Status of Indian Languages in MT | ilmulti |
| Siripragada et al. (2020) | 22.91 | A Multilingual Parallel Corpora Collection Effort for Indian Languages | ilmulti |
| Goyal et al. (2019) | 19.06 | LTRC-MT Simple & Effective Hindi-English Neural Machine Translation Systems at WAT 2019 | |

English -> Hindi

| Model | BLEU | Paper / Source | Code |
| --- | --- | --- | --- |
| Philip et al. (2018) | 21.57 | CVIT-MT Systems for WAT-2018 | |
| Philip et al. (2020) | 21.20 | Revisiting Low Resource Status of Indian Languages in MT | ilmulti |
| Saini et al. (2018) | 18.215 | Neural Machine Translation for English to Hindi | |