Title
Automatic Adaptation of Annotations
Abstract
Manually annotated corpora are indispensable resources, yet for many annotation tasks, such as the creation of treebanks, there exist multiple corpora with different and incompatible annotation guidelines. This leads to an inefficient use of human expertise, but it could be remedied by integrating knowledge across corpora with different annotation guidelines. In this article we describe the problem of annotation adaptation and the intrinsic principles of the solutions, and present a series of successively enhanced models that can automatically adapt the divergence between different annotation formats. We evaluate our algorithms on the tasks of Chinese word segmentation and dependency parsing. For word segmentation, where there are no universal segmentation guidelines because of the lack of morphology in Chinese, we perform annotation adaptation from the much larger People's Daily corpus to the smaller but more popular Penn Chinese Treebank. For dependency parsing, we perform annotation adaptation from the Penn Chinese Treebank to a semantics-oriented Dependency Treebank, which is annotated using significantly different annotation guidelines. In both experiments, automatic annotation adaptation brings significant improvement, achieving state-of-the-art performance despite the use of purely local features in training.
Year: 2015
DOI: 10.1162/COLI_a_00210
Venue: Computational Linguistics
Field: Annotation, Information retrieval, Segmentation, Computer science, Text segmentation, Dependency grammar, Natural language processing, Artificial intelligence, Treebank
DocType: Journal
Volume: 41
Issue: 1
ISSN: 0891-2017
Citations: 0
PageRank: 0.34
References: 45
Authors: 4
Name | Order | Citations | PageRank
Wenbin Jiang | 1 | 355 | 36.55
Yajuan Lü | 2 | 276 | 20.00
Liang Huang | 3 | 1484 | 75.40
Qun Liu | 4 | 2149 | 203.11