Title
Learning semantic definitions of online information sources
Abstract
The Internet contains a very large number of information sources providing many types of data from weather forecasts to travel deals and financial information. These sources can be accessed via Web-forms, Web Services, RSS feeds and so on. In order to make automated use of these sources, we need to model them semantically, but writing semantic descriptions for Web Services is both tedious and error prone. In this paper we investigate the problem of automatically generating such models. We introduce a framework for learning Datalog definitions of Web sources. In order to learn these definitions, our system actively invokes the sources and compares the data they produce with that of known sources of information. It then performs an inductive logic search through the space of plausible source definitions in order to learn the best possible semantic model for each new source. In this paper we perform an empirical evaluation of the system using real-world Web sources. The evaluation demonstrates the effectiveness of the approach, showing that we can automatically learn complex models for real sources in reasonable time. We also compare our system with a complex schema matching system, showing that our approach can handle the kinds of problems tackled by the latter.
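The comparison step described in the abstract can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes sources can be modelled as Python functions from an input string to a set of output tuples, and it scores each candidate definition by its average Jaccard overlap with the new source over a few sample inputs. Every name here (`jaccard`, `score_candidate`, the toy zip-code data and the two candidates) is a hypothetical chosen for illustration.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

Row = Tuple[Hashable, ...]
Source = Callable[[str], Set[Row]]


def jaccard(a: Set[Row], b: Set[Row]) -> float:
    """Overlap of two tuple sets; two empty sets count as a perfect match."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def score_candidate(new_source: Source, candidate: Source,
                    inputs: Iterable[str]) -> float:
    """Average Jaccard overlap between the new source and a candidate."""
    scores = [jaccard(new_source(x), candidate(x)) for x in inputs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical): a known source mapping zip codes to (city, state).
KNOWN = {
    "90292": {("Marina del Rey", "CA")},
    "10001": {("New York", "NY")},
}


def known_lookup(zip_code: str) -> Set[Row]:
    return KNOWN.get(zip_code, set())


def new_source(zip_code: str) -> Set[Row]:
    # Stand-in for an undocumented service we invoke; it returns cities only.
    return {(city,) for city, _state in KNOWN.get(zip_code, set())}


def candidate_city_only(zip_code: str) -> Set[Row]:
    # Candidate definition: project the known source onto the city column.
    return {(city,) for city, _state in known_lookup(zip_code)}


def candidate_state_only(zip_code: str) -> Set[Row]:
    # Competing candidate: project onto the state column instead.
    return {(state,) for _city, state in known_lookup(zip_code)}


samples = ["90292", "10001"]
candidates = [candidate_city_only, candidate_state_only]
# Keep the candidate whose output best matches what the new source returns.
best = max(candidates, key=lambda c: score_candidate(new_source, c, samples))
print(best.__name__)
```

In this toy setting the search space holds only two hand-written candidates; a real system would generate and refine candidates automatically instead of enumerating them by hand.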
Year: 2007
DOI: 10.1613/jair.2205
Venue: J. Artif. Intell. Res. (JAIR)
Keywords: complex model, information source, online information source, known source, web services, financial information, real-world web source, web source, new source, semantic definition, empirical evaluation, complex schema, web service, semantic model, weather forecasting
Field: Information retrieval, Computer science, Data type, Artificial intelligence, Social Semantic Web, Schema matching, Web service, Datalog, RSS, Machine learning, The Internet, Semantic data model
DocType: Journal
Volume: 30
Issue: 1
ISSN: 1076-9757
Citations: 17
PageRank: 0.93
References: 25
Authors (46)
Name
Order
Citations
PageRank
Mark Carman156349.18
Craig A. Knoblock25229680.57
Vadim Bulitko367067.16
Nathan R. Sturtevant478080.81
junmin lu5170.93
Timothy Yau6855.01
Marco Pistore73021181.74
Moshe Y. Vardi8134132267.07
jonathan bredin9170.93
David C. Parkes103293342.69
quang duong11170.93
Simone Paolo Ponzetto122280129.35
Michael Strube132142137.32
Ariel Felner141239105.75
Richard E. Korf153568729.78
ram meshulam16462.51
Robert C. Holte173041414.38
Andrew Kachites McCallum18192031588.22
x wang19172.28
Andrés Corrada-Emmanuel2042637.13
Giorgos Stoilos21124167.47
George P. Stamou22201.77
Jeff Z. Pan232218158.01
Vassilis Tzouvaras2465241.89
Ian Horrocks25117311086.65
C. Li26213.49
felip manya27170.93
jordi planes28170.93
john c bell29170.93
Marilyn A Walker303893418.91
Amanda J. Stent311094103.35
francois mairesse32170.93
rashmi prasad33170.93
Matthias R. Mehl3428419.65
Roger K. Moore35594137.04
sergio greco36170.93
irina trubitsyna37170.93
Ester Zumpano3851862.16
simon i hill39170.93
Arnaud Doucet403891525.98
Carmel Domshlak412156123.57
james arthur hoffmann42170.93
Indrajit Bhattacharya4361933.31
Lise Getoor444365320.21
István Szita4534525.48
András Lörincz46266.61