| Abstract |
|---|
| A number of recent articles in computational linguistics venues called for a closer examination of the type of noise present in annotated datasets used for benchmarking (Reidsma and Carletta, 2008; Beigman Klebanov and Beigman, 2009). In particular, Beigman Klebanov and Beigman articulated a type of noise they call annotation noise and showed that in the worst case such noise can severely degrade the generalization ability of a linear classifier (Beigman and Beigman Klebanov, 2009). In this paper, we provide quantitative empirical evidence for the existence of this type of noise in a recently benchmarked dataset. The proposed methodology can be used to zero in on unreliable instances, facilitating the generation of cleaner gold standards for benchmarking. |
| Year | Venue | Keywords |
|---|---|---|
| 2010 | HLT-NAACL | annotation noise, generalization ability, linear classifier, computational linguistics venue, closer examination, benchmarked dataset, annotated datasets, noise present, cleaner gold standard, beigman klebanov, empirical evidence |
| Field | DocType | ISBN |
|---|---|---|
| Data mining, Annotation, Empirical evidence, Computer science, Computational linguistics, Artificial intelligence, Linear classifier, Benchmarking, Machine learning | Conference | 1-932432-65-5 |
| Citations | PageRank | References |
|---|---|---|
| 2 | 0.44 | 12 |
| Authors |
|---|
| 2 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Beata Beigman Klebanov | 1 | 137 | 19.49 |
| Eyal Beigman | 2 | 108 | 9.70 |