Abstract |
---|
Despite progress in the development of computational means, human input is still critical in the production of consistent and usable aligned corpora and term banks. This is especially true for specialized corpora and term banks, whose end-users are often professionals with very stringent requirements for accuracy, consistency and coverage. In the compilation of a high-quality Chinese-English legal glossary for the ELDoS project, we have identified a number of issues that make the role of human input critical for term alignment and extraction. They include the identification of low-frequency terms, paraphrastic expressions and discontinuous units, and the maintenance of consistent term granularity. Although manual intervention can address these issues more satisfactorily, steps must also be taken to address intra- and inter-annotator inconsistency. |
Year | DOI | Venue |
---|---|---|
2002 | 10.3115/1118824.1118826 | SIGHAN@COLING |
Keywords | Field | DocType |
term bank,human input,low frequency term,terminology extraction,term alignment,consistent term granularity,chinese-english legal glossary,role human input,discontinuous unit,bilingual alignment,computational mean,eldos project,low frequency | Expression (mathematics),Computer science,Artificial intelligence,Natural language processing,Granularity,Glossary,Terminology extraction | Conference |
Citations | PageRank | References |
2 | 0.39 | 4 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lawrence Cheung | 1 | 2 | 0.39 |
Tom B. Y. Lai | 2 | 46 | 7.75 |
Robert Luk | 3 | 97 | 5.88 |
Oi Yee Kwong | 4 | 250 | 30.70 |
KingKui Sin | 5 | 2 | 0.73 |
Benjamin K. Tsou | 6 | 380 | 31.76 |