Abstract |
---|
Measuring the semantic relatedness between two entities is the basis for numerous tasks in IR, NLP, and Web-based knowledge extraction. This paper focuses on disambiguating names in a Web or text document by jointly mapping all names onto semantically related entities registered in a knowledge base. To this end, we have developed a novel notion of semantic relatedness between two entities represented as sets of weighted (multi-word) keyphrases, with consideration of partially overlapping phrases. This measure improves the quality of prior link-based models, and also eliminates the need for (usually Wikipedia-centric) explicit interlinkage between entities. Thus, our method is more versatile and can cope with long-tail and newly emerging entities that have few or no links associated with them. For efficiency, we have developed approximation techniques based on min-hash sketches and locality-sensitive hashing. Our experiments on semantic relatedness and on named entity disambiguation demonstrate the superiority of our method compared to state-of-the-art baselines. |
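
The abstract mentions min-hash sketches and locality-sensitive hashing as the efficiency device but gives no detail. The following is a minimal, generic sketch of those two techniques applied to plain (unweighted) keyphrase sets. It is not the paper's relatedness measure, which also uses phrase weights and partial phrase overlap. The function names (`token_hash`, `minhash_sketch`, `estimate_jaccard`, `share_lsh_bucket`), the use of BLAKE2b as the hash family, and the parameter values are all assumptions made for illustration.

```python
import hashlib


def token_hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token; the seed selects one hash function.

    Using salted BLAKE2b here is an illustrative choice, not taken from the paper.
    """
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(items: set[str], num_hashes: int = 256) -> list[int]:
    """Min-hash sketch: for each hash function, keep the smallest hash value."""
    if not items:
        raise ValueError("cannot sketch an empty set")
    return [min(token_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a: list[int], sketch_b: list[int]) -> float:
    """Fraction of sketch positions that agree; this estimates Jaccard similarity."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


def share_lsh_bucket(sketch_a: list[int], sketch_b: list[int], bands: int = 32) -> bool:
    """Banding LSH: two sketches are candidates if any whole band is identical."""
    rows = len(sketch_a) // bands
    for i in range(bands):
        lo, hi = i * rows, (i + 1) * rows
        if sketch_a[lo:hi] == sketch_b[lo:hi]:
            return True
    return False


if __name__ == "__main__":
    # Made-up toy keyphrase sets for two hypothetical entities.
    entity_a = {"knowledge base", "entity linking", "semantic relatedness"}
    entity_b = {"knowledge base", "entity linking", "hash function"}
    sa, sb = minhash_sketch(entity_a), minhash_sketch(entity_b)
    print("estimated Jaccard:", estimate_jaccard(sa, sb))
    print("LSH candidate pair:", share_lsh_bucket(sa, sb))
```

Under these assumptions, comparing two entities costs one pass over fixed-length sketches instead of a full set intersection, and the banding step lets a system skip entity pairs that share no bucket at all. More hash functions tighten the estimate; more bands with fewer rows each make the candidate filter more permissive.
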
Year | DOI | Venue |
---|---|---|
2012 | 10.1145/2396761.2396832 | CIKM |
Keywords | Field | DocType |
overlapping phrase, approximation technique, explicit interlinkage, novel notion, min-hash sketch, knowledge base, disambiguating name, web-based knowledge extraction, semantic relatedness, entity disambiguation, numerous task, locality sensitive hashing | Entity linking, Semantic similarity, Locality-sensitive hashing, Information retrieval, Computer science, Knowledge extraction, Artificial intelligence, Hash function, Natural language processing, Knowledge base, Text document | Conference |
Citations | PageRank | References |
94 | 2.80 | 31 |
Authors |
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Johannes Hoffart | 1 | 1362 | 52.62 |
Stephan Seufert | 2 | 279 | 10.69 |
Dat Ba Nguyen | 3 | 127 | 5.87 |
Martin Theobald | 4 | 1474 | 72.06 |
Gerhard Weikum | 5 | 12710 | 2146.01 |