Title
HighLife: Higher-arity Fact Harvesting.
Abstract
Text-based knowledge extraction methods for populating knowledge bases have focused on binary facts: relationships between two entities. However, in advanced domains such as health, it is often crucial to consider ternary and higher-arity relations. An example is to capture which drug is used for which disease at which dosage (e.g., 2.5 mg/day) for which kinds of patients (e.g., children vs. adults). In this work, we present an approach to harvest higher-arity facts from textual sources. Our method is distantly supervised by seed facts, and uses the fact-pattern duality principle to gather fact candidates with high recall. For high precision, we devise a constraint-based reasoning method to eliminate false candidates. A major novelty is in coping with the difficulty that higher-arity facts are often expressed only partially in texts and strewn across multiple sources. For example, one sentence may refer to a drug, a disease and a group of patients, whereas another sentence talks about the drug, its dosage and the target group without mentioning the disease. Our methods cope well with such partially observed facts, at both the pattern-learning and the constraint-reasoning stages. Experiments with health-related documents and with news articles demonstrate the viability of our method.
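To make the notion of a partially observed higher-arity fact concrete, the following is a minimal sketch, not the HighLife method itself. It assumes a hypothetical 4-ary relation with the slot order (drug, disease, dosage, patient group), represents unobserved slots as `None`, and uses placeholder entity names (`DrugX`, `DiseaseY`). The helper names `compatible` and `merge` are illustrative only.

```python
from typing import Optional, Tuple

# Assumed slot order (hypothetical): (drug, disease, dosage, patient_group).
# None marks an argument that the source sentence did not mention.
Fact = Tuple[Optional[str], ...]


def compatible(a: Fact, b: Fact) -> bool:
    """Two partial facts are compatible if they never disagree on a filled slot."""
    if len(a) != len(b):
        return False
    return all(x is None or y is None or x == y for x, y in zip(a, b))


def merge(a: Fact, b: Fact) -> Fact:
    """Combine two compatible partial facts, filling each slot from either side."""
    if not compatible(a, b):
        raise ValueError("conflicting partial facts")
    return tuple(x if x is not None else y for x, y in zip(a, b))


# One sentence gives drug, disease and patient group; another gives
# drug, dosage and patient group (placeholder values, not real data).
s1: Fact = ("DrugX", "DiseaseY", None, "children")
s2: Fact = ("DrugX", None, "2.5 mg/day", "children")
merged = merge(s1, s2)
```

Merging on agreement alone can combine observations that do not actually belong together, which is presumably why the abstract pairs high-recall candidate gathering with a constraint-based reasoning step to eliminate false candidates.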
Year: 2018
DOI: 10.1145/3178876.3186000
Venue: WWW '18: The Web Conference 2018, Lyon, France, April 2018
Field: Knowledge graph, Arity, Computer science, Coping (psychology), Natural language processing, Artificial intelligence, Knowledge extraction, Novelty, Recall, Sentence, Machine learning
DocType: Conference
ISBN: 978-1-4503-5639-8
Citations: 4
PageRank: 0.40
References: 40
Authors (3)
Name, Order, Citations, PageRank
Patrick Ernst, 1, 70, 6.51
Amy Siu, 2, 8, 2.83
Gerhard Weikum, 3, 12710, 2146.01