Abstract | ||
---|---|---|
Linking entities such as people, organisations, books, music groups and their songs in text to knowledge bases (KBs) is becoming a fundamental step in many downstream search and mining tasks. Achieving high disambiguation accuracy depends crucially on a rich and holistic representation of the entities in the KB. For popular entities, such a representation can be easily mined from Wikipedia, and many current entity disambiguation and linking methods make use of this fact. However, Wikipedia does not contain long-tail entities that only a few people are interested in, and it sometimes lags behind on newly emerging entities. For such entities, mining a suitable representation in a fully automated fashion is very difficult, resulting in poor linking accuracy.
To address this issue, we propose a retrieval-based approach which leverages the human in the loop, prompting for user feedback on candidate entities and on characteristic keyphrases. First, assuming that we can mine high-quality entity representations from relevant documents about a candidate entity, we explore the relevant keyphrase space by systematically exploiting human feedback using query expansions. Second, to account for specialization, we diversify the resulting documents to increase coverage. Finally, we propose novel gradient interleaving methods to account for topic drift and user engagement. We conducted extensive experiments on the FACC dataset, showing that our approaches convincingly outperform carefully selected baselines in both intrinsic and extrinsic measures while keeping the users engaged. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/2983323.2983798 | ACM International Conference on Information and Knowledge Management |
Keywords | DocType | Volume |
---|---|---|
Knowledge Mining, Named Entity Disambiguation, Retrieval Model, Entity Description | Journal | abs/1810.10252 |
Citations | PageRank | References |
---|---|---|
1 | 0.39 | 17 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jaspreet Singh | 1 | 20 | 2.96 |
Johannes Hoffart | 2 | 1362 | 52.62 |
Avishek Anand | 3 | 102 | 11.61 |