Title
Building linked open data using approximate string matching methods and domain specific resources
Abstract
We built a linked open data set from the Allie database, which stores pairs of abbreviations and their long forms used in the life sciences. We attempted to link long forms to DBpedia entries using key collision methods (fingerprint and n-gram fingerprint), and used UMLS to absorb term variation in the life sciences. Combining the key collision methods with domain-specific tools and dictionaries yielded DBpedia links for more than five-sevenths of the long forms in Allie that appear 100 times or more in MEDLINE, and for around 90 percent of those that appear 500 times or more. The string matching achieved an F-measure of 0.98, and the number of links between Allie and DBpedia is 77,608. This outcome helps Allie users find knowledge related to the long forms of interest.
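The abstract names the two key collision methods but does not spell out their normalization steps. The sketch below illustrates the general idea under the assumption that the keys are built the way these methods are commonly described (lowercase, strip punctuation, sort unique tokens or character n-grams); it is not the authors' implementation, the function names are hypothetical, and the UMLS step is not modeled.

```python
import re
import string
import unicodedata
from collections import defaultdict

# Assumption: ASCII punctuation is simply deleted before keying.
_PUNCT = re.compile("[" + re.escape(string.punctuation) + "]")


def _to_ascii(text):
    """Drop diacritics so that accented and unaccented spellings share a key."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def fingerprint(text):
    """Token-level key: lowercase, remove punctuation, sort unique tokens."""
    cleaned = _PUNCT.sub("", _to_ascii(text.strip().lower()))
    return " ".join(sorted(set(cleaned.split())))


def ngram_fingerprint(text, n=2):
    """Character-level key: sorted unique n-grams of the alphanumeric characters."""
    cleaned = re.sub(r"[^a-z0-9]", "", _to_ascii(text.lower()))
    grams = {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}
    return "".join(sorted(grams))


def key_collisions(long_forms, entry_labels, key_fn):
    """Map each long form to the entry labels that share its key (hypothetical helper)."""
    index = defaultdict(set)
    for label in entry_labels:
        key = key_fn(label)
        if key:  # skip labels that normalize to an empty key
            index[key].add(label)
    links = {}
    for long_form in long_forms:
        key = key_fn(long_form)
        if key and key in index:
            links[long_form] = sorted(index[key])
    return links


if __name__ == "__main__":
    labels = ["tumor necrosis factor", "interleukin 6"]
    forms = ["Factor, Tumor Necrosis", "Tumor-necrosis factor"]
    print(key_collisions(forms, labels, fingerprint))
    print(key_collisions(forms, labels, ngram_fingerprint))
```

In this sketch the token-level key tolerates word reordering and punctuation, while the character n-gram key also tolerates differences in spacing and hyphenation, which is one reason the two might be used together.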
Year
2011
DOI
10.1145/2166896.2166927
Venue
SWAT4LS
Keywords
open data, domain-specific tool, allie user, allie database, life science, n-gram fingerprint, long form, dbpedia entry, approximate string, appearance frequency, domain specific resource, key collision method, string matching, linked open data, linked data, approximate string matching
Field
String searching algorithm, Data mining, Information retrieval, Computer science, Linked data, Fingerprint, Collision, Approximate string matching, MEDLINE, Unified Medical Language System
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
3
Name                Order   Citations   PageRank
Yasunori Yamamoto   1       143         22.77
Atsuko Yamaguchi    2       149         16.11
Akinori Yonezawa    3       1613        226.97