Abstract |
---|
This paper describes work on Named Entity Recognition (NER), in preparation for Relation Extraction (RE), on data from a historical archive organisation. As is often the case in the cultural heritage domain, the source text includes a high percentage of specialist terminology, and is of very variable quality in terms of grammaticality and completeness. The NER and RE tasks were carried out using a specially annotated corpus, and are themselves preliminary steps in a larger project whose aim is to transform discovered relations into a graph structure that can be queried using standard tools. Experimental results from the NER task are described, with emphasis on dealing with nested entities using a multi-word token method. The overall objective is to improve access by non-specialist users to a valuable cultural resource. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1109/ICSC.2007.63 | ICSC |
Keywords | Field | DocType |
---|---|---|
relation extraction,source text,information retrieval systems,relational databases,cultural heritage | Data mining,Relational database,Cultural heritage,Computer science,Natural language processing,Artificial intelligence,Relationship extraction,Entity linking,Information retrieval,Terminology,Grammaticality,Source text,Named-entity recognition | Conference |
ISBN | Citations | PageRank |
---|---|---|
0-7695-2997-6 | 23 | 0.98 |
References | Authors |
---|---|
12 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kate Byrne | 1 | 59 | 4.36 |