Title
Adapting Searchy to extract data using evolved wrappers
Abstract
Organizations need diverse information systems to cope with growing requirements for information storage and processing, which creates information islands and makes it intrinsically difficult to obtain a global view. Providing a unified view of the (likely heterogeneous) information available in an organization adds value to its information systems and has been the subject of intense research. In this paper we present an extension of Searchy, an agent-based mediator system specialized in data extraction and integration. Through a set of wrappers, it integrates information from arbitrary sources and semantically translates it according to a mediated schema. Searchy is, in effect, a domain-independent wrapper container that eases wrapper development by providing, for example, semantic mapping. The extension proposed in this paper introduces an evolutionary wrapper that evolves wrappers based on regular expressions. To achieve this, a Genetic Algorithm (GA) is used to learn a regex that extracts a set of positive samples while rejecting a set of negative samples.
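The abstract describes learning a regex that accepts positive samples and rejects negative ones. As a rough illustration of how a GA could score one candidate, here is a minimal sketch. The function name `regex_fitness`, the equal weighting of positives and negatives, and the use of whole-string matching are all assumptions made for this example; the abstract does not specify the paper's actual fitness function.

```python
import re


def regex_fitness(pattern, positives, negatives):
    """Hypothetical fitness: fraction of samples classified correctly.

    A positive sample counts when the regex matches it entirely;
    a negative sample counts when the regex does not match it entirely.
    Patterns that fail to compile score 0.0.
    """
    try:
        compiled = re.compile(pattern)
    except re.error:
        return 0.0

    total = len(positives) + len(negatives)
    if total == 0:
        return 0.0

    accepted = sum(1 for s in positives if compiled.fullmatch(s))
    rejected = sum(1 for s in negatives if not compiled.fullmatch(s))
    return (accepted + rejected) / total


# Illustrative samples (invented for this sketch, not taken from the paper).
positives = ["2011-08-16", "2012-01-03"]
negatives = ["2011/08/16", "hello"]

good = regex_fitness(r"\d{4}-\d{2}-\d{2}", positives, negatives)
weak = regex_fitness(r"\d+", positives, negatives)
broken = regex_fitness("(", positives, negatives)
```

A GA would evaluate a score like this for every individual in the population and favor higher-scoring regexes during selection. Invalid patterns get the lowest score so that they tend to be discarded.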
Year
2012
DOI
10.1016/j.eswa.2011.08.168
Venue
Expert Syst. Appl.
Keywords
domain-independent wrapper container,unified view,information system,genetic algorithm,diverse information system,adapting searchy,wrapper development,information island,global view,evolutionary wrapper,information storage,genetic algorithms,information extraction
Field
Information system,Data mining,Zipf's law,Regular expression,Semantic mapping,Computer science,Information extraction,Artificial intelligence,Data extraction,Machine learning,Genetic algorithm,Alphabet
DocType
Journal
Volume
39
Issue
3
ISSN
0957-4174
Citations
3
PageRank
0.37
References
16
Authors
3
Name, Order, Citations, PageRank
David F. Barrero, 1, 120, 17.17
María D. R-Moreno, 2, 97, 15.22
David Camacho, 3, 278, 24.89