Abstract | ||
---|---|---|
The value of language resources is greatly enhanced if they share a common markup with an explicit minimal semantics. Achieving this goal for lexical databases is difficult, as large-scale resources can realistically only be obtained by up-translation from pre-existing dictionaries, each with its own proprietary structure. This issue is a central concern in our work in the CONCEDE project, which aims to develop compatible lexical databases for six Central and Eastern European languages. This paper describes the approach we have taken in CONCEDE. Starting with sample entries from original presentation-oriented electronic representations of dictionaries, we discuss how we first transform the data into an intermediate TEI-compatible representation, and from there into a more restrictive shared encoding, formalised as an XML DTD with a clearly-defined semantic interpretation. |
Year | Venue | Field |
---|---|---|
2000 | LREC | Computer science, Semantic interpretation, Natural language processing, Artificial intelligence, Hyperlink, Semantics, Database, Markup language, Encoding (memory), Document type definition |
DocType | Citations | PageRank |
---|---|---|
Conference | 5 | 3.95 |
References | Authors |
---|---|
4 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tomaz Erjavec | 1 | 537 | 60.89 |
Roger Evans | 2 | 344 | 55.12 |
Nancy Ide | 3 | 82 | 15.74 |
Adam Kilgarriff | 4 | 5 | 3.95 |