Abstract |
---|
This work describes how derivation tree fragments based on a variant of Tree Adjoining Grammar (TAG) can be used to check treebank consistency. Annotations of word sequences are compared both for their internal structural consistency and for their external relation to the rest of the tree. We expand on earlier work in this area in three ways. First, we provide a more complete description of the system, showing why a naive use of TAG structures does not work and how this leads to a necessary refinement. We also provide a more complete account of the processing pipeline, including the grouping of structurally similar errors and the elimination of duplicates. Second, we include the new, experimental external relation check to find an additional class of errors. Third, we broaden the evaluation to include both the internal and external relation checks, and evaluate the system on both an Arabic and an English treebank. The evaluation has been successful enough that the internal check has been integrated into the standard pipeline for current English treebank construction at the Linguistic Data Consortium. |
Year | Venue | Keywords |
---|---|---|
2012 | LREC 2012 - Eighth International Conference on Language Resources and Evaluation | Quality control, treebanking, Tree Adjoining Grammar |
Field | DocType | Citations |
Tree-adjoining grammar, Linguistic Data Consortium, Annotation, Arabic, Computer science, Error detection and correction, Artificial intelligence, Treebank, Natural language processing | Conference | 2 |
PageRank | References | Authors |
0.52 | 5 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Seth Kulick | 1 | 221 | 29.66 |
Ann Bies | 2 | 136 | 20.02 |
Justin Mott | 3 | 27 | 4.93 |