Title
Comparison of Methods to Annotate Named Entity Corpora.
Abstract
The authors compared two methods for annotating a corpus for the named entity (NE) recognition task using non-expert annotators: (i) revising the output of an existing NE recognizer (semi-automatic annotation) and (ii) annotating the NEs entirely by hand (fully manual annotation). Annotation time, degree of agreement, and performance were evaluated against the gold standard. Because two annotators worked on each text under each method, performance was measured in two ways: as the average performance of both annotators and as the performance when at least one annotator is correct. The experiments show that semi-automatic annotation is faster, achieves better agreement, and performs better on average. However, they also indicate that fully manual annotation should be used for some texts whose document types differ substantially from those of the training data. In addition, machine learning experiments using the semi-automatically and fully manually annotated corpora as training data indicate that, for some texts, the F-measure can be better with manual annotation than with semi-automatic annotation. Finally, experiments using the annotated corpora as additional training data show that (i) NE recognition performance does not always correspond to the performance of the NE tag annotation and (ii) the system trained with the manually annotated corpus outperforms the system trained with the semi-automatically annotated corpus on newswires, even though the existing NE recognizer was trained mainly on newswires.
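The two performance figures mentioned in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the span representation `(start, end, type)`, the function names, and the toy data are all assumptions made for illustration, and the "at least one annotator is correct" figure is read here as the share of gold entities that either annotator tagged exactly, which is only one plausible interpretation.

```python
from typing import Set, Tuple

# Hypothetical representation: an entity is (start, end, type), matched exactly.
Span = Tuple[int, int, str]


def prf(pred: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure of one annotation against the gold standard."""
    tp = len(pred & gold)  # entities whose span and type both match
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def average_f(a: Set[Span], b: Set[Span], gold: Set[Span]) -> float:
    """Mean F-measure of the two annotators who worked on the same text."""
    return (prf(a, gold)[2] + prf(b, gold)[2]) / 2


def either_correct(a: Set[Span], b: Set[Span], gold: Set[Span]) -> float:
    """Share of gold entities that at least one annotator tagged correctly
    (an assumed reading of the second figure)."""
    return len((a | b) & gold) / len(gold) if gold else 0.0


# Toy data, invented for this example only.
gold = {(0, 2, "PER"), (5, 7, "ORG"), (9, 10, "LOC"), (12, 14, "DATE")}
annotator_a = {(0, 2, "PER"), (5, 7, "ORG"), (9, 10, "ORG")}  # one wrong type
annotator_b = {(0, 2, "PER"), (9, 10, "LOC")}                 # two entities missed

print(average_f(annotator_a, annotator_b, gold))
print(either_correct(annotator_a, annotator_b, gold))
```

Exact span-and-type matching is the strictest common choice; a partial-match scheme would give different numbers for the same annotations.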
Year
2018
DOI
10.1145/3218820
Venue
ACM Trans. Asian & Low-Resource Lang. Inf. Process.
Keywords
Annotation, named entity extraction, non-expert annotator
Field
Training set, Annotation, Computer science, Manual annotation, Named entity, Natural language processing, Artificial intelligence
DocType
Journal
Volume
17
Issue
4
ISSN
2375-4699
Citations
0
PageRank
0.34
References
11
Authors
5
Name
Order
Citations
PageRank
Kanako Komiya 1 319.59
Masaya Suzuki 2 22.10
Tomoya Iwakura 3 6815.95
Minoru Sasaki 4 416.44
Hiroyuki Shinnou 5 10730.12