Multi-task learning aims to learn several tasks simultaneously while maximizing performance on one or all of them.
The Natural Language Decathlon (decaNLP) is a benchmark for studying general NLP models that can perform a variety of complex natural language tasks. It evaluates performance on ten disparate natural language tasks.
Results can be seen on the public decaNLP leaderboard.
The General Language Understanding Evaluation benchmark (GLUE) is a tool for evaluating and analyzing the performance of models across a diverse range of existing natural language understanding tasks. Each task is scored with its own metric (for example accuracy, F1, or a correlation coefficient), and models are ranked by the average of these per-task scores.
State-of-the-art results can be seen on the public GLUE leaderboard.
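The averaging across tasks described for GLUE can be sketched as an unweighted mean of per-task scores. This is a minimal illustration, not the benchmark's official scoring code: the function name `macro_average`, the task names, and the score values are all hypothetical, and the sketch assumes every task's score has already been put on a common 0-100 scale.

```python
from statistics import mean
from typing import Mapping


def macro_average(task_scores: Mapping[str, float]) -> float:
    """Return the unweighted mean of per-task scores.

    Assumption: each score is already on the same scale (e.g. 0-100),
    so that averaging across tasks is meaningful.
    """
    if not task_scores:
        raise ValueError("task_scores must not be empty")
    return mean(list(task_scores.values()))


# Hypothetical per-task scores, invented purely for illustration.
example_scores = {"task_a": 60.0, "task_b": 80.0, "task_c": 70.0}
overall = macro_average(example_scores)
print(overall)
```

Because every task contributes equally regardless of its dataset size, a model that does very well on a large task but poorly on a small one is not rewarded for the size difference.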