Title
Towards a method for unsupervised web information extraction
Abstract
The literature provides a variety of techniques for building the information extractors on which some data integration systems rely. These techniques are usually based on extraction rules that require maintenance and adaptation when web sources change. We present our preliminary steps towards an unsupervised information extraction technique that searches web documents for shared patterns and fragments the documents until it finds the relevant information to be extracted. Experimental results on 1230 real web documents show that our system is fast and achieves promising results.
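The abstract gives only the general idea (find patterns shared by the documents, then fragment around them), so the following is a minimal illustrative sketch of that idea and not the authors' algorithm. It assumes documents are already tokenized, treats the longest token run common to all documents as template text, and recurses on what lies before and after it. The names longest_shared_pattern, find, and fragment are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Tokens = Tuple[str, ...]


def find(doc: Tokens, pattern: Tokens) -> int:
    """Return the index of the first occurrence of pattern in doc, or -1."""
    n = len(pattern)
    for i in range(len(doc) - n + 1):
        if doc[i:i + n] == pattern:
            return i
    return -1


def longest_shared_pattern(docs: Sequence[Tokens]) -> Optional[Tokens]:
    """Return the longest token run that occurs in every document, or None."""
    if not docs or any(len(d) == 0 for d in docs):
        return None
    base = min(docs, key=len)  # a shared run cannot be longer than the shortest doc
    for size in range(len(base), 0, -1):
        for start in range(len(base) - size + 1):
            candidate = base[start:start + size]
            if all(find(d, candidate) >= 0 for d in docs):
                return candidate
    return None


def fragment(docs: Sequence[Tokens]) -> List[List[Tokens]]:
    """Split documents around shared patterns; return the varying fragments.

    Each item of the result holds one fragment per input document, taken
    from the same position relative to the shared patterns.
    """
    pattern = longest_shared_pattern(docs)
    if pattern is None:
        # Nothing shared remains: keep the group only if some fragment is non-empty.
        return [list(docs)] if any(docs) else []
    prefixes, suffixes = [], []
    for d in docs:
        i = find(d, pattern)
        prefixes.append(d[:i])
        suffixes.append(d[i + len(pattern):])
    return fragment(prefixes) + fragment(suffixes)


# Assumed toy input: two pages generated from the same template.
pages = [
    tuple("<html> <b> Title: </b> Dune <i> Price: </i> 9.99 </html>".split()),
    tuple("<html> <b> Title: </b> Emma <i> Price: </i> 4.50 </html>".split()),
]
for column in fragment(pages):
    print(column)
```

The brute-force pattern search is chosen for readability, not speed. A value that happens to be identical in every document would be mistaken for template text, which is one reason this sketch should not be read as the published method.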
Year
2012
DOI
10.1007/978-3-642-31753-8_36
Venue
ICWE
Keywords
extraction rule, unsupervised information extraction technique, data integration system, information extraction technique, web sources change, relevant information, preliminary step, web document, information extractor, unsupervised web information extraction
Field
Data integration, Data mining, World Wide Web, Information retrieval, Computer science, Information extraction, Web information, Relationship extraction
DocType
Conference
Citations
1
PageRank
0.35
References
10
Authors
2
Name
Order
Citations
PageRank
Hassan A. Sleiman11038.33
Rafael Corchuelo238949.87