Abstract |
---|
The Data Web contains a wealth of knowledge on a large number of domains. Question answering over interlinked data sources is challenging due to two inherent characteristics. First, different datasets employ heterogeneous schemas, and each one may contain only part of the answer to a given question. Second, constructing a federated formal query across different datasets requires exploiting links between those datasets on both the schema and instance levels. We present a question answering system that transforms user-supplied queries (i.e., natural language sentences or keywords) into conjunctive SPARQL queries over a set of interlinked data sources. The contribution of this paper is twofold. First, we introduce a novel approach for determining the most suitable resources for a user-supplied query from different datasets (disambiguation). We employ a hidden Markov model whose parameters were bootstrapped with different distribution functions. Second, we present a novel method for constructing federated formal queries using the disambiguated resources and leveraging the linking structure of the underlying datasets. This approach essentially relies on a combination of domain and range inference as well as a link traversal method for constructing a connected graph, which is ultimately rendered as a corresponding SPARQL query. The results of our evaluation with three life-science datasets and 25 benchmark queries demonstrate the effectiveness of our approach. |
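
The disambiguation step mentioned in the abstract can be pictured as decoding the most likely sequence of hidden states (candidate resources) for a sequence of observed keywords. The following is a minimal sketch using standard Viterbi decoding, not the authors' implementation. The resource names (`ex1:sideEffect`, `ex2:Aspirin`, `ex3:AspirinTrial`) and all probabilities are hypothetical values picked by hand for illustration, whereas the paper bootstraps its parameters from distribution functions that are not reproduced here.

```python
from typing import Dict, List, Optional, Tuple


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most probable state sequence for the observed keywords."""
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None) for s in states}
    ]
    for obs in observations[1:]:
        prev = best[-1]
        layer: Dict[str, Tuple[float, Optional[str]]] = {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            p, back = max(
                ((prev[r][0] * trans_p[r].get(s, 0.0), r) for r in states),
                key=lambda pair: pair[0],
            )
            layer[s] = (p * emit_p[s].get(obs, 0.0), back)
        best.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path


# Hypothetical candidate resources from three made-up datasets (ex1, ex2, ex3).
states = ["ex1:sideEffect", "ex2:Aspirin", "ex3:AspirinTrial"]
start_p = {s: 1.0 / len(states) for s in states}
# Hand-picked transition weights; a higher weight stands in for closer linkage.
trans_p = {
    "ex1:sideEffect": {"ex1:sideEffect": 0.1, "ex2:Aspirin": 0.7, "ex3:AspirinTrial": 0.2},
    "ex2:Aspirin": {"ex1:sideEffect": 0.6, "ex2:Aspirin": 0.1, "ex3:AspirinTrial": 0.3},
    "ex3:AspirinTrial": {"ex1:sideEffect": 0.3, "ex2:Aspirin": 0.6, "ex3:AspirinTrial": 0.1},
}
# Hand-picked emission weights: how well a keyword matches a resource label.
emit_p = {
    "ex1:sideEffect": {"side effect": 0.9},
    "ex2:Aspirin": {"aspirin": 0.8},
    "ex3:AspirinTrial": {"aspirin": 0.5},
}

path = viterbi(["side effect", "aspirin"], states, start_p, trans_p, emit_p)
print(path)
```

In this toy setup the keyword "aspirin" matches two candidate resources, and the transition weights decide between them. A second stage would then connect the selected resources into a query graph, which this sketch does not attempt.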
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2488388.2488488 | WWW |
Keywords | Field | DocType |
interlinked data source,certain question,different datasets,different distribution function,life-science datasets,question answering,conjunctive sparql query,underlying datasets,federated formal query,corresponding sparql query,benchmark query,rdf,hidden markov model,sparql,linked data | Data mining,Computer science,SPARQL,Artificial intelligence,Recommender system,World Wide Web,Question answering,Tree traversal,Information retrieval,Inference,Data Web,Natural language,Hidden Markov model,Machine learning | Conference |
ISBN | Citations | PageRank |
978-1-4503-2035-1 | 27 | 1.14 |
References | Authors |
---|---|
30 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Saeedeh Shekarpour | 1 | 40 | 3.70 |
Axel-Cyrille Ngonga Ngomo | 2 | 1775 | 139.40 |
Sören Auer | 3 | 5711 | 418.56 |