Title
Knowledge vault: a web-scale approach to probabilistic knowledge fusion.
Abstract
Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Microsoft's Satori, and Google's Knowledge Graph. To increase the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous approaches have primarily focused on text-based extraction, which can be very noisy. Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. We employ supervised machine learning methods for fusing these distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness. We report the results of multiple studies that explore the relative utility of the different information sources and extraction methods.
Year: 2014
DOI: 10.1145/2623330.2623623
Venue: KDD
Keywords: information extraction, knowledge bases, machine learning, probabilistic models, statistical databases, textual databases
Field: Data mining, Computer science, Correctness, Knowledge-based systems, Information extraction, Artificial intelligence, Knowledge extraction, Knowledge base, Probabilistic logic, Web content, Machine learning, Open Knowledge Base Connectivity
DocType: Conference
Citations: 440
PageRank: 11.51
References: 39
Authors: 9
Name                 Order  Citations  PageRank
Xin Luna Dong        1      2524       129.18
Evgeniy Gabrilovich  2      4573       224.48
Geremy Heitz         3      1076       52.33
Wilko Horn           4      549        14.20
Ni Lao               5      986        39.73
Michael Kuperberg    6      7589       529.66
Thomas Strohmann     7      440        11.51
Shaohua Sun          8      622        16.73
Wei Zhang            9      452        19.35