Abstract |
---|
Hypernymy relations are an important asset in many applications, and a central ingredient to Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. In this paper, we introduce WebIsALOD, a Linked Open Data release of the IsA database, containing 400M hypernymy relations, each provided with rich provenance information. As the original dataset contained more than 80% wrong, noisy extractions, we run a machine learning algorithm to assign confidence scores to the individual statements. Furthermore, 2.5M links to DBpedia and 23.7k links to the YAGO class hierarchy were created at a precision of 97%. In total, the dataset contains 5.4B triples. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1007/978-3-319-68204-4_11 | Lecture Notes in Computer Science |
Keywords | Field | DocType |
Hypernyms, Hearst patterns, Linked dataset | Ontology (information science), Data mining, Information retrieval, Computer science, Linked data, Semantic Web, Class hierarchy | Conference |
Volume | ISSN | Citations |
10588 | 0302-9743 | 3 |
PageRank | References | Authors |
0.41 | 5 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sven Hertling | 1 | 61 | 12.33 |
Heiko Paulheim | 2 | 1095 | 84.19 |