Title: Some empirical evidence for annotation noise in a benchmarked dataset
Abstract: A number of recent articles in computational linguistics venues have called for a closer examination of the type of noise present in annotated datasets used for benchmarking (Reidsma and Carletta, 2008; Beigman Klebanov and Beigman, 2009). In particular, Beigman Klebanov and Beigman articulated a type of noise they call annotation noise and showed that, in the worst case, such noise can severely degrade the generalization ability of a linear classifier (Beigman and Beigman Klebanov, 2009). In this paper, we provide quantitative empirical evidence for the existence of this type of noise in a recently benchmarked dataset. The proposed methodology can be used to zero in on unreliable instances, facilitating the generation of cleaner gold standards for benchmarking.
Year: 2010
Venue: HLT-NAACL
Keywords: annotation noise, generalization ability, linear classifier, computational linguistics venue, closer examination, benchmarked dataset, annotated datasets, noise present, cleaner gold standard, beigman klebanov, empirical evidence
Field: Data mining, Annotation, Empirical evidence, Computer science, Computational linguistics, Artificial intelligence, Linear classifier, Benchmarking, Machine learning
DocType: Conference
ISBN: 1-932432-65-5
Citations: 2
PageRank: 0.44
References: 12
Authors: 2
Name: Beata Beigman Klebanov, Order: 1, Citations: 137, PageRank: 19.49
Name: Eyal Beigman, Order: 2, Citations: 108, PageRank: 9.70