Title
Transforming Wikipedia into a large scale multilingual concept network
Abstract
A knowledge base for real-world language processing applications should consist of a large base of facts and reasoning mechanisms that combine them to induce novel and more complex information. This paper describes an approach to deriving such a large scale and multilingual resource by exploiting several facets of the on-line encyclopedia Wikipedia. We show how we can build upon Wikipedia's existing network of categories and articles to automatically discover new relations and their instances. Working on top of this network allows for added information to influence the network and be propagated throughout it using inference mechanisms that connect different pieces of existing knowledge. We then exploit this gained information to discover new relations that refine some of those found in the previous step. The result is a network containing approximately 3.7 million concepts with lexicalizations in numerous languages and 49+ million relation instances. Intrinsic and extrinsic evaluations show that this is a high quality resource and beneficial to various NLP tasks.
Year
2013
DOI
10.1016/j.artint.2012.06.008
Venue
Artif. Intell.
Keywords
new relation, large scale multilingual concept, knowledge base, high quality resource, complex information, million relation instance, million concept, large base, transforming wikipedia, multilingual resource, large scale, added information
Field
Data science, Inference, Computer science, Exploit, Encyclopedia, Knowledge base, Instrumental and intrinsic value, Knowledge acquisition
DocType
Journal
Volume
194
Issue
1
ISSN
0004-3702
Citations
43
PageRank
1.05
References
45
Authors
2
Name, Order, Citations, PageRank
Vivi Nastase, 1, 523, 41.30
Michael Strube, 2, 2142, 137.32