Language-Agnostic Relation Extraction From Abstracts In Wikis - Citegraph

Paper Info

Title
Language-Agnostic Relation Extraction From Abstracts In Wikis

Abstract
Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6 M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted. In a second series of experiments, we show how the approach can be transferred to DBkWik, a knowledge graph extracted from thousands of Wikis. We discuss the challenges and first results of extracting relations from a larger set of Wikis, using a less formalized knowledge graph.
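The distant-supervision idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the labelling rule (treating a linked entity as a negative example only when the graph already holds at least one value of the relation for that subject), and the specific features (link position, link count, sentence index, entity types) are all assumptions chosen because they need no language-specific processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A linked entity in an abstract, viewed as a possible object of one relation.

    Hypothetical record for illustration; the field names are assumptions.
    """
    subject: str           # entity the abstract describes
    obj: str               # entity linked from the abstract
    position: int          # 0-based rank of the link among all links in the abstract
    total_links: int       # number of links in the abstract
    sentence_index: int    # 0-based index of the sentence containing the link
    obj_types: frozenset   # types of the linked entity in the knowledge graph


def label_candidates(candidates, known_objects):
    """Label candidates using facts already present in the knowledge graph.

    known_objects maps a subject to the set of objects the graph holds for the
    relation. Subjects without any known object are skipped, because nothing
    in the graph tells us whether their candidates are right or wrong
    (an assumed labelling rule, not taken from the source).
    """
    labelled = []
    for cand in candidates:
        known = known_objects.get(cand.subject)
        if not known:
            continue
        labelled.append((cand, cand.obj in known))
    return labelled


def feature_vector(cand, type_vocabulary):
    """Build language-independent features: positions, counts, and type flags."""
    vector = [
        float(cand.position),
        cand.position / cand.total_links,  # relative position of the link
        float(cand.total_links),
        float(cand.sentence_index),
    ]
    # One flag per type in the vocabulary, in the order given.
    vector.extend(1.0 if t in cand.obj_types else 0.0 for t in type_vocabulary)
    return vector


if __name__ == "__main__":
    # Toy data for a hypothetical "country" relation.
    candidates = [
        Candidate("Mannheim", "Germany", 1, 4, 0, frozenset({"Country", "Place"})),
        Candidate("Mannheim", "Rhine", 2, 4, 1, frozenset({"River", "Place"})),
    ]
    graph_facts = {"Mannheim": {"Germany"}}
    for cand, is_positive in label_candidates(candidates, graph_facts):
        print(cand.obj, is_positive, feature_vector(cand, ["Country", "Place", "River"]))
```

The labelled vectors could then be passed to any classifier; the abstract names a RandomForest classifier. Because no feature depends on tokenization or parsing, the same code would apply unchanged to abstracts in any language edition.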

Year	2018
DOI	10.3390/info9040075
Venue	INFORMATION
Keywords	relation extraction, knowledge graphs, Wikipedia, DBpedia, DBkWik, Wiki farms
Field	Training set, Graph, Knowledge graph, Computer science, Exploit, Artificial intelligence, Natural language processing, Classifier (linguistics), Machine learning, Relationship extraction
DocType	Journal
Volume	9
Issue	4
Citations	2
PageRank	0.42
References	10
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Nicolas Heist	1	2	2.45
Sven Hertling	2	61	12.33
Heiko Paulheim	3	1095	84.19
