Abstract |
---|
The literature provides a variety of techniques to build the information extractors on which some data integration systems rely. Information extraction techniques are usually based on extraction rules that require maintenance and adaptation if web sources change. We present our preliminary steps towards an unsupervised information extraction technique that searches web documents for shared patterns and fragments them until finding the relevant information that should be extracted. Experimental results on 1230 real-web documents demonstrate that our system performs fast and achieves promising results. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-31753-8_36 | ICWE |
Keywords | Field | DocType |
extraction rule, unsupervised information extraction technique, data integration system, information extraction technique, web sources change, relevant information, preliminary step, web document, information extractor, unsupervised web information extraction | Data integration, Data mining, World Wide Web, Information retrieval, Computer science, Information extraction, Web information, Relationship extraction | Conference |
Citations | PageRank | References |
1 | 0.35 | 10 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hassan A. Sleiman | 1 | 103 | 8.33 |
Rafael Corchuelo | 2 | 389 | 49.87 |