Abstract |
---|
Ambiguities, which are inherently present in natural languages, make it challenging to determine the actual identities of entities mentioned in a document (e.g., Paris can refer to a city in France, but it can also refer to a small city in Texas, USA, or to the 1984 film Paris, Texas, directed by Wim Wenders). Disambiguation is a problem that can be successfully solved by entity resolution methods. This paper studies various methods for estimating relatedness between entities, used in collective entity resolution. We define a unified entity resolution approach capable of using implicit as well as explicit relatedness for collectively identifying in-text entities. As a relatedness measure, we propose a method that expresses relatedness using the heterogeneous relations of a domain ontology. We also experiment with other relatedness measures, such as statistical learning of co-occurrences of two entities or content similarity between them. Evaluation on real data shows that the new methods for relatedness estimation give good results. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/978-3-642-10871-6_7 | ASWC |
Keywords | Field | DocType |
natural language, entity resolution | Ontology, Semantic integration, Data mining, Semantic annotation, Computer science, Natural language processing, Statistical learning, Artificial intelligence, Entity linking, Ontology (information science), Name resolution, Information retrieval, Natural language | Conference |
Volume | ISSN | Citations |
5926 | 0302-9743 | 7 |
PageRank | References | Authors |
0.53 | 30 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tadej Stajner | 1 | 32 | 4.78 |
Dunja Mladenic | 2 | 1484 | 170.14 |