Abstract | ||
---|---|---|
Machine knowledge about the world’s entities should include quantity properties, such as heights of buildings, running times of athletes, energy efficiency of car models, energy production of power plants, and more. State-of-the-art knowledge bases (KBs), such as Wikidata, cover many relevant entities but often miss the corresponding quantities. Prior work on extracting quantity facts from web contents focused on high precision for top-ranked outputs, but did not tackle the KB coverage issue. This paper presents a recall-oriented approach which aims to close this gap in knowledge-base coverage. Our method is based on iterative learning for extracting quantity facts, with two novel contributions to boost recall for KB augmentation without sacrificing the quality standards of the knowledge base. The first contribution is a query expansion technique to capture a larger pool of fact candidates. The second contribution is a novel technique for harnessing observations on value distributions for self-consistency. Experiments with extractions from more than 13 million web documents demonstrate the benefits of our method. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3485447.3511932 | International World Wide Web Conference |
Keywords | DocType | Citations |
---|---|---|
Knowledge Bases, Information Extraction, Quantity Facts | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 9 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Le Vinh Thinh | 1 | 20 | 2.15 |
Daria Stepanova | 2 | 46 | 12.10 |
Dragan Milchevski | 3 | 0 | 0.68 |
Jannik Strötgen | 4 | 492 | 38.20 |
Gerhard Weikum | 5 | 12710 | 2146.01 |