A Taxonomy for In-depth Evaluation of Normalization for User Generated Content. - Citegraph

Paper Info

Title
A Taxonomy for In-depth Evaluation of Normalization for User Generated Content.

Abstract
In this work, we present a taxonomy of error categories for lexical normalization, which is the task of translating user generated content to canonical language. We annotate a recent normalization dataset to test the practical use of the taxonomy and reach near-perfect agreement. This annotated dataset is then used to evaluate how an existing normalization model performs on the different categories of the taxonomy. The results of this evaluation reveal that some of the problematic categories only include minor transformations, whereas most regular transformations are solved quite well.

Year	Venue	Field
2018	LREC	User-generated content, Normalization (statistics), Computer science, Natural language processing, Artificial intelligence, Normalization model
DocType	Citations	PageRank
Conference	0	0.34
References	Authors
6	3

Authors (3 rows)

Name	Order	Citations	PageRank
Rob van der Goot	1	7	4.21
Rik van Noord	2	16	4.73
van noord	3	684	92.89
