Title
An Unsupervised Method for Linking Entity Mentions in Chinese Text.
Abstract
Entity linking is the task of linking entity mentions in text to unambiguous entities in a knowledge base. It is a key step in expanding a knowledge base and can improve the information filtering ability of online recommendation systems, search engines, and other practical applications. However, the large number of entities and the diversity and ambiguity of entity names pose major challenges for entity linking research. In addition, the scarcity of Chinese knowledge bases and the complex syntax of Chinese text hinder research on Chinese entity linking. To meet the requirements of processing Chinese text, we propose an unsupervised Chinese entity linking method named un-CEML. The method uses Baidu encyclopedia as its knowledge base, applies a similarity algorithm to retrieve entries from Baidu encyclopedia, and combines this with characteristics of the encyclopedia to obtain candidate entities. This handles abbreviated and wrongly segmented entity mentions while controlling the size of the candidate set and the probability that it contains the target entity. In the candidate ranking stage, we use the dependencies between sentence components to extract the information most strongly related to an entity mention as its context, which reduces noise when computing similarity with candidate entities. Because nominal mentions are mostly common words with little correlation to the document's knowledge, we handle them separately. We conduct experiments on real data sets and compare our method with several standard methods. The experimental results show that our method can resolve the ambiguity of Chinese entity mentions and achieves high linking accuracy.
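The two stages named in the abstract (candidate generation by name similarity, then ranking by context similarity) can be illustrated with a minimal sketch. This is not the un-CEML implementation: the toy knowledge base `KB`, the functions `name_similarity`, `candidates`, `jaccard`, and `link`, the 0.5 threshold, and the choice of character-ratio and Jaccard measures are all assumptions made for illustration. English strings stand in for Chinese text, and the dependency-based context extraction is not modeled; the context is simply passed in as a set of words.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge base (not Baidu encyclopedia data):
# entry title -> set of words describing the entry.
KB = {
    "Apple Inc.": {"company", "iphone", "technology", "cupertino"},
    "Apple (fruit)": {"fruit", "tree", "orchard", "sweet"},
    "Pineapple": {"fruit", "tropical", "plant"},
}


def name_similarity(a, b):
    """Character-level similarity ratio in [0, 1] (an assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidates(mention, kb, threshold=0.5):
    """Stage 1: keep entries whose title is similar enough to the mention."""
    return [title for title in kb if name_similarity(mention, title) >= threshold]


def jaccard(a, b):
    """Set overlap between the mention context and an entry description."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link(mention, context, kb, threshold=0.5):
    """Stage 2: rank candidates by context similarity and return the best."""
    cands = candidates(mention, kb, threshold)
    if not cands:
        return None  # no candidate: the mention stays unlinked
    return max(cands, key=lambda title: jaccard(context, kb[title]))


if __name__ == "__main__":
    print(link("apple", {"iphone", "company", "released"}, KB))
    print(link("apple", {"fruit", "orchard", "tree"}, KB))
```

With the toy data, the same mention "apple" resolves to different entries depending on the context words supplied, which is the disambiguation effect the abstract describes.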
Year
2016
DOI
10.1007/978-3-319-49178-3_14
Venue
ADVANCES IN SERVICES COMPUTING
Keywords
Entity linking, Baidu encyclopedia, Information extraction, Unsupervised, Chinese text
Field
Entity linking, Recommender system, Data mining, Information retrieval, Ranking, Computer science, Information extraction, Encyclopedia, Knowledge base, Ambiguity, Sentence, Distributed computing
DocType
Conference
Volume
10065
ISSN
0302-9743
Citations
0
PageRank
0.34
References
12
Authors
4
Name, Order, Citations, PageRank
Jing Xu101.01
Liang Gan201.35
Bin Zhou3719.48
Quanyuan Wu415326.73