Abstract | ||
---|---|---|
The authors compared two methods for annotating a corpus for the named entity (NE) recognition task using non-expert annotators: (i) revising the results of an existing NE recognizer (semi-automatic annotation) and (ii) annotating the NEs entirely by hand (fully manual annotation). The annotation time, degree of agreement, and performance were evaluated based on the gold standard. Because each text was annotated by two annotators under each method, two performance figures were evaluated: the average performance of the two annotators and the performance when at least one annotator is correct. The experiments reveal that semi-automatic annotation is faster, achieves better agreement, and performs better on average. However, they also indicate that fully manual annotation should be used for texts whose document types differ substantially from those of the training data. In addition, machine learning experiments using the semi-automatically and fully manually annotated corpora as training data indicate that, for some texts, the F-measure can be higher when manual rather than semi-automatic annotation is used. Finally, experiments using the annotated corpora as additional training data show that (i) NE recognition performance does not always correspond to the performance of the NE tag annotation and (ii) the system trained with the manually annotated corpus outperforms the system trained with the semi-automatically annotated corpus on newswires, even though the existing NE recognizer was mainly trained with newswires.
|
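The two performance figures described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The entity representation (start offset, end offset, label), the exact-match criterion, and in particular the way "at least one annotator is correct" is combined into a single score are assumptions made here for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
from typing import Set, Tuple

# Assumption: an entity is (start_offset, end_offset, label), matched exactly.
Entity = Tuple[int, int, str]


def f_measure(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Balanced F-measure of exact-match entities against the gold standard."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def average_f(a: Set[Entity], b: Set[Entity], gold: Set[Entity]) -> float:
    """Mean of the two annotators' individual F-measures."""
    return (f_measure(a, gold) + f_measure(b, gold)) / 2


def at_least_one_correct_f(a: Set[Entity], b: Set[Entity], gold: Set[Entity]) -> float:
    """F-measure when an item counts as correct if either annotator got it right.

    Assumed interpretation: a gold entity is found if either annotator marked
    it, and a spurious entity is an error only if both annotators marked it.
    """
    combined = ((a | b) & gold) | ((a & b) - gold)
    return f_measure(combined, gold)


# Toy data, invented for this example only.
gold = {(0, 2, "PER"), (5, 7, "ORG"), (9, 10, "LOC")}
annotator_a = {(0, 2, "PER"), (5, 7, "LOC"), (12, 13, "PER")}
annotator_b = {(5, 7, "ORG"), (9, 10, "LOC"), (12, 13, "PER")}

print(f"average: {average_f(annotator_a, annotator_b, gold):.3f}")
print(f"at least one correct: {at_least_one_correct_f(annotator_a, annotator_b, gold):.3f}")
```

On this toy data the "at least one correct" score is higher than the average, because each annotator recovers gold entities that the other missed; only the spurious entity that both annotators marked still counts against the combined result.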
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3218820 | ACM Trans. Asian & Low-Resource Lang. Inf. Process. |
Keywords | Field | DocType |
Annotation, named entity extraction, non-expert annotator | Training set, Annotation, Computer science, Manual annotation, Named entity, Natural language processing, Artificial intelligence | Journal |
Volume | Issue | ISSN |
17 | 4 | 2375-4699 |
Citations | PageRank | References |
0 | 0.34 | 11 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kanako Komiya | 1 | 31 | 9.59 |
Masaya Suzuki | 2 | 2 | 2.10 |
Tomoya Iwakura | 3 | 68 | 15.95 |
Minoru Sasaki | 4 | 41 | 6.44 |
Hiroyuki Shinnou | 5 | 107 | 30.12 |