Title
Semantic Labeling: A Domain-Independent Approach.
Abstract
Semantic labeling is the process of mapping attributes in data sources to classes in an ontology and is a necessary step in heterogeneous data integration. Variations in data formats, attribute names and even ranges of values of data make this a very challenging task. In this paper, we present a novel domain-independent approach to automatic semantic labeling that uses machine learning techniques. Previous approaches use machine learning to learn a model that extracts features related to the data of a domain, which requires the model to be re-trained for every new domain. Our solution uses similarity metrics as features to compare against labeled domain data and learns a matching function to infer the correct semantic labels for data. Since our approach depends on the learned similarity metrics but not the data itself, it is domain-independent and only needs to be trained once to work effectively across multiple domains. In our evaluation, our approach achieves higher accuracy than other approaches, even when the learned models are trained on domains other than the test domain.
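The core idea above, comparing an unlabeled attribute against labeled attributes through similarity metrics rather than through the raw data, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two features (Jaccard index over attribute-name trigrams and over value sets), the helper names, the toy data, and the averaging score are all assumptions made for illustration. In particular, the averaging score is only a placeholder for the matching function that the paper learns from training data.

```python
def jaccard(a, b):
    """Jaccard index of two iterables, each treated as a set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def trigrams(text):
    """Character trigrams of a lowercased string."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


def similarity_features(name, values, ref_name, ref_values):
    """Feature vector for one (unlabeled, labeled) attribute pair.

    The features are similarities, not raw data, so they do not
    depend on the domain the attributes come from.
    """
    return [
        jaccard(trigrams(name), trigrams(ref_name)),  # attribute-name similarity
        jaccard(values, ref_values),                  # value-overlap similarity
    ]


def predict_label(name, values, labeled, score=lambda f: sum(f) / len(f)):
    """Return the label of the best-scoring labeled attribute.

    `score` is a hypothetical stand-in for a learned matching function.
    """
    best = max(
        labeled,
        key=lambda ref: score(
            similarity_features(name, values, ref["name"], ref["values"])
        ),
    )
    return best["label"]


# Toy labeled data (invented for this example).
labeled = [
    {"label": "City", "name": "city", "values": ["Paris", "Rome", "Oslo"]},
    {"label": "Country", "name": "country", "values": ["France", "Italy", "Norway"]},
]

print(predict_label("city_name", ["Rome", "Oslo", "Lima"], labeled))
```

Because `predict_label` only ever sees similarity scores, the same scoring function could in principle be reused with labeled attributes from a different domain, which is the property the abstract describes.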
Year: 2016
DOI: 10.1007/978-3-319-46523-4_27
Venue: Lecture Notes in Computer Science
Field: Data integration, Ontology, Data mining, Computer science, Semantic labeling, Natural language processing, Artificial intelligence, Jaccard index, Random forest
DocType: Conference
Volume: 9981
ISSN: 0302-9743
Citations: 12
PageRank: 0.61
References: 8
Authors: 4
Name               Order  Citations  PageRank
Minh Pham          1      14         2.68
Suresh Alse        2      12         0.95
Craig A. Knoblock  3      5229       680.57
Pedro Szekely      4      1217       179.80