Title
From annotator agreement to noise models
Abstract
This article discusses the transition from annotated data to a gold standard, that is, a subset that is sufficiently noise-free with high confidence. Unless appropriately reinterpreted, agreement coefficients do not indicate the quality of the data set as a benchmarking resource: High overall agreement is neither sufficient nor necessary to distill some amount of highly reliable data from the annotated material. A mathematical framework is developed that allows estimation of the noise level of the agreed subset of annotated data, which helps promote cautious benchmarking.
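The point that overall agreement does not by itself tell us how noisy the agreed subset is can be illustrated with a toy calculation. The sketch below is not the article's framework; it rests on an assumed simplification in which a fraction of items is "hard" and each annotator labels those by an independent fair coin flip between two labels, while all other items are labeled correctly by everyone. The function names and the parameter `hard_fraction` are hypothetical and introduced only for illustration.

```python
def observed_agreement(hard_fraction: float, annotators: int = 2) -> float:
    """Share of all items on which every annotator gives the same label.

    Assumption (illustrative only): easy items are always labeled
    identically; hard items are labeled by independent fair coin flips.
    """
    if not 0.0 <= hard_fraction <= 1.0:
        raise ValueError("hard_fraction must lie in [0, 1]")
    if annotators < 1:
        raise ValueError("annotators must be at least 1")
    # Two possible unanimous outcomes, each with probability (1/2)**annotators.
    chance_agreement = 2 * 0.5 ** annotators
    return (1.0 - hard_fraction) + hard_fraction * chance_agreement


def agreed_subset_noise(hard_fraction: float, annotators: int = 2) -> float:
    """Share of the unanimously labeled items that are hard (coin-flip) items."""
    total_agreed = observed_agreement(hard_fraction, annotators)
    chance_agreement = 2 * 0.5 ** annotators
    agreed_hard = hard_fraction * chance_agreement
    return agreed_hard / total_agreed


if __name__ == "__main__":
    for k in (2, 4):
        for h in (0.1, 0.3):
            print(
                f"annotators={k} hard_fraction={h} "
                f"agreement={observed_agreement(h, k):.3f} "
                f"noise_in_agreed={agreed_subset_noise(h, k):.3f}"
            )
```

Under these assumptions, adding annotators lowers the noise in the unanimously agreed subset even though it also lowers observed agreement, which is one way to see why the two quantities should not be conflated.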
Year
2009
DOI
10.1162/coli.2009.35.4.35402
Venue
Computational Linguistics
Keywords
high overall agreement, agreement coefficient, high confidence, gold standard, cautious benchmarking, reliable data, benchmarking resource, annotator agreement, annotated data, annotated material, mathematical framework, noise model
Field
Data mining, Computer science, Noise level, Benchmarking
DocType
Journal
Volume
35
Issue
4
ISSN
0891-2017
Citations
27
PageRank
1.41
References
19
Authors
2
Name                    Order  Citations  PageRank
Beata Beigman Klebanov  1      137        19.49
Eyal Beigman            2      108        9.70