Title
A tree-based WQI modeling approach for integrating Web databases
Abstract
Every day, more and more specialized databases (car rental, hotels, airfares, etc.) become available on the Web and can only be queried through a Web Query Interface (WQI). As the number of domain-specific databases on the Web increases, it is becoming very difficult for end users to explore the information stored in them. In this context, research efforts focus on building a single (unified) domain-specific WQI that allows users to query and integrate information available in different Web databases. The construction of such an integrated WQI for a given domain involves several complex tasks, especially the extraction, representation, understanding, and mapping of the semantic content of each individual WQI associated with a Web database. Previous approaches have considered hierarchical models to build integrated WQIs, preserving the ancestor-descendant relationships in individual WQIs. In this work, we propose a novel tree-based approach for the automatic construction of a hierarchical model of the visual content of WQIs, representing their components in a clear and concise form. In the proposed approach, the Document Object Model (DOM) tree of each WQI considered in the integration process is processed by a specialized web resource to obtain relevant visual information in the WQI, such as fields (UIs), groups of UIs, and super-groups, as well as their corresponding labels. This process is guided by a set of 8 heuristic design rules for the correct identification of labels and components. Experiments to evaluate the proposed strategy were conducted on the ICQ and Tel-8 datasets of the UIUC repository. Our results show that the proposed tree-based approach for representing the visual components in a WQI achieves an accuracy of more than 94%, improving on currently reported approaches and easing the integration process of domain-specifi
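The hierarchical model described in the abstract (fields, groups of fields, and super-groups, each with a label) could be represented roughly as follows. This is only an illustrative sketch: the class `WQINode`, its method names, and the flight-search example are assumptions introduced here, not the authors' implementation, and the 8 heuristic rules used to build such a tree from a DOM are not reproduced.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class WQINode:
    """One node of a hypothetical hierarchical WQI model."""
    kind: str   # "interface", "super-group", "group", or "field"
    label: str  # label text associated with this component
    children: List["WQINode"] = field(default_factory=list)

    def add(self, child: "WQINode") -> "WQINode":
        """Attach a child component and return it, so calls can be nested."""
        self.children.append(child)
        return child

    def fields_with_path(
        self, prefix: Tuple[str, ...] = ()
    ) -> Iterator[Tuple[Tuple[str, ...], str]]:
        """Yield (ancestor labels, field label) for every field in this subtree."""
        if self.kind == "field":
            yield prefix, self.label
        path = prefix + (self.label,)
        for child in self.children:
            yield from child.fields_with_path(path)


# Made-up airfare interface: one super-group, two groups, four fields.
root = WQINode("interface", "Flight search")
trip = root.add(WQINode("super-group", "Trip details"))
departure = trip.add(WQINode("group", "Departure"))
departure.add(WQINode("field", "City"))
departure.add(WQINode("field", "Date"))
ret = trip.add(WQINode("group", "Return"))
ret.add(WQINode("field", "Date"))
root.add(WQINode("field", "Passengers"))

# Each field keeps its ancestor-descendant context.
paths = list(root.fields_with_path())
for ancestors, label in paths:
    print(" > ".join(ancestors + (label,)))
```

Keeping the ancestor labels with each field is what lets two fields with the same label (here, the two "Date" fields) be told apart when interfaces are later matched and merged.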
Year
2014
Venue
Information Fusion
Keywords
database management systems, document handling, query processing, trees (mathematics), user interfaces, Document Object Model tree, ICQ dataset, Tel-8 dataset, UIUC repository, Web database integration, Web query interface, Web resource, ancestor-descendant relationships, domain-specific databases, heuristic rules, hierarchical models, semantic content extraction, semantic content mapping, semantic content representation, semantic content understanding, tree-based WQI modeling approach
Field
Web resource, Web search query, Information retrieval, Semantic Web Stack, Computer science, Data Web, Semantic Web, Document Object Model, Social Semantic Web, Hierarchical database model, Database
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5