Title
Mining High-Quality Fine-Grained Type Information from Chinese Online Encyclopedias.
Abstract
Entity typing is a necessary step in building knowledge graphs. Considerable effort has gone into mining type information for entities from online encyclopedias, but usually only coarse-grained type information is obtained, which is not fine enough for knowledge graph construction or query answering. The situation is even worse for entities in Chinese. In this paper, we mine high-quality fine-grained type information for entities not only from the title-labels and info-boxes in an entity’s encyclopedia page, but also from the abstracts and crowd-labels in the page, which provide many more candidate fine-grained types (with noise). To maintain the high quality of the mined type information, we initially take reliable type information only from the title-labels and info-boxes. Then, by putting entities, attributes, values and types into one graph, we obtain path information between each candidate entity-type pair and rely on a proposed Path-CNN binary classification model to identify more correct entity-type pairs from the graph. Compared with the previous approach and DBpedia, our approach mines far more high-quality fine-grained type information for entities from the online encyclopedia. Applying our approach to the largest Chinese online encyclopedia, Baidu Baike, we generate 25,651,022 pieces of type information (with more than 80% accuracy) for the entities in this encyclopedia.
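The graph step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the node and edge layout, the function names `build_graph` and `paths_between`, the `max_edges` limit, and the toy data are all assumptions made for illustration. The sketch only enumerates the paths between one candidate entity-type pair; the Path-CNN classifier that would score those paths is not shown.

```python
from collections import defaultdict


def build_graph(infobox_triples, reliable_types):
    """Put entities, attributes, values and types into one undirected graph.

    Assumed layout (not from the paper): each entity is linked to the
    attributes and values in its info-box and to its reliable types.
    """
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for entity, attribute, value in infobox_triples:
        link(("entity", entity), ("attribute", attribute))
        link(("entity", entity), ("value", value))
    for entity, type_name in reliable_types:
        link(("entity", entity), ("type", type_name))
    return graph


def paths_between(graph, source, target, max_edges=3):
    """Return all simple paths from source to target with at most max_edges edges."""
    found = []
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        if len(path) - 1 >= max_edges:
            continue
        for neighbour in sorted(graph[node]):
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                stack.append(path + [neighbour])
    return found


# Hypothetical toy data: entity B has a reliable type, entity A is a candidate.
triples = [("A", "occupation", "singer"), ("B", "occupation", "singer")]
reliable = [("B", "Singer")]
graph = build_graph(triples, reliable)
paths = paths_between(graph, ("entity", "A"), ("type", "Singer"))
for path in paths:
    print(" -> ".join(name for _, name in path))
```

In this toy graph, the candidate pair (A, Singer) is connected through the shared attribute and the shared value of entity B. Paths of this kind are what a binary classifier would take as input when deciding whether the pair is correct.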
Year: 2018
Venue: WISE
Field: Data mining, Graph, Knowledge graph, Binary classification, Information retrieval, Computer science, Encyclopedia, Online encyclopedia
DocType: Conference
Citations: 1
PageRank: 0.35
References: 16
Authors: 4
Name            Order   Citations   PageRank
Maoxiang Hao    1       1           0.35
Zhixu Li        2       210         43.55
Yan Zhao        3       45          9.79
Kai Zheng       4       936         69.43