Title
Gem-based entity-knowledge maintenance
Abstract
Knowledge bases about entities have become a vital asset for Web search, recommendations, and analytics. Examples are Freebase, which is the core of the Google Knowledge Graph, and the use of Wikipedia for distant supervision in numerous IR and NLP tasks. However, maintaining the knowledge about less prominent entities in the long tail is often a bottleneck, as human contributors face the tedious task of continuously identifying and reading relevant sources. To overcome this limitation and accelerate the maintenance of knowledge bases, we propose an approach that automatically extracts, from the Web, key contents for given input entities. Our method, called GEM, generates salient contents about a given entity, using minimal assumptions about the underlying sources, while meeting the constraint that the user is willing to read only a certain amount of information. Salient content pieces have variable length and are computed by solving a budget-constrained optimization problem that decides which sub-pieces of an input text should be selected for the final result. GEM can be applied to a variety of knowledge-gathering settings, including news streams and speech input from videos. Our experimental studies show the viability of the approach and demonstrate improvements over various baselines in terms of precision and recall.
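The abstract does not spell out the optimization itself, so the following is only an illustrative sketch of what a budget-constrained selection of text pieces can look like, not GEM's actual formulation. It assumes each candidate piece already carries a salience score (how that score is computed is outside this sketch), measures cost in whitespace-separated words, and solves the resulting 0/1 knapsack exactly with dynamic programming. The function name `select_under_budget` and the sample data are hypothetical.

```python
from typing import List, Tuple


def select_under_budget(pieces: List[Tuple[str, float]], budget: int) -> List[str]:
    """Pick text pieces maximizing total salience with total word count <= budget.

    `pieces` holds (text, salience) pairs; the salience scores are assumed
    to be given. This is a plain 0/1 knapsack, used here as a stand-in for
    the budget-constrained optimization mentioned in the abstract.
    """
    lengths = [len(text.split()) for text, _ in pieces]
    # best[b] = (best total score, indices chosen) using at most b words
    best = [(0.0, ())] * (budget + 1)
    for i, (_, score) in enumerate(pieces):
        w = lengths[i]
        # Walk budgets downward so each piece is selected at most once.
        for b in range(budget, w - 1, -1):
            candidate = best[b - w][0] + score
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - w][1] + (i,))
    return [pieces[i][0] for i in best[budget][1]]


# Hypothetical candidates: (text, salience score)
candidates = [("a b c", 3.0), ("d e", 2.5), ("f g", 2.4), ("h", 0.5)]
print(select_under_budget(candidates, budget=4))  # ['d e', 'f g']
```

With a budget of four words, the two mid-scoring two-word pieces together outscore the single highest-scoring three-word piece plus the one-word filler, which is the kind of trade-off a reading budget forces.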
Year: 2013
DOI: 10.1145/2505515.2505715
Venue: CIKM
Keywords: google knowledge graph, salient content, budget-constrained optimization problem, web search, input text, speech input, knowledge base, gem-based entity-knowledge maintenance, nlp task, salient content piece, input entity, relatedness
Field: Data mining, Bottleneck, Computer science, Baseline (configuration management), Artificial intelligence, Natural language processing, Long tail, Analytics, Optimization problem, Information retrieval, Precision and recall, Novelty, Salient
DocType: Conference
Citations: 3
PageRank: 0.46
References: 23
Authors: 2
Name, Order, Citations, PageRank
Bilyana Taneva 141014.37
Gerhard Weikum 2127102146.01