GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. - Citegraph

Paper Info

Title
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

Abstract
For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.

Year	Venue	DocType	Volume	Citations	PageRank	References	Authors
2018	International Conference on Learning Representations	Conference	abs/1804.07461	58	1.33	39	6

Authors (6 rows)

Name	Order	Citations	PageRank
Alex Wang	1	71	5.27
Amanpreet Singh	2	109	8.34
Julian Michael	3	78	5.08
Felix Hill	4	346	17.90
Omer Levy	5	1387	56.96
Samuel R. Bowman	6	906	44.99
