Title
Sift: an end-user tool for gathering web content on the go
Abstract
Although web sites have started to embed semantic metadata within their documents, it remains a challenge for non-technical end-users to exploit that markup to extract and store information of interest. To address this challenge, we show how tools can be developed that allow users to identify extractable information while browsing and then control how that information should be extracted and stored in a personal library. The proposed approach is based on an extensible framework capable of using different kinds of markup to aid the extraction process and a unique fusion of several well-established techniques from areas such as the semantic web, data warehousing, web scraping and web feeds. We present the Sift tool which is a proof-of-concept implementation of the approach.
Year
2012
DOI
10.1145/2361354.2361395
Venue
ACM Symposium on Document Engineering
Keywords
sift tool, web content, extractable information, end-user tool, web site, data warehousing, store information, web scraping, different kind, embed semantic metadata, semantic web, proof of concept, information extraction
Field
Web development, World Wide Web, Information retrieval, Web page, Semantic Web Stack, Computer science, Data Web, Semantic Web, Web modeling, Social Semantic Web, HTML, Database
DocType
Conference
Citations
4
PageRank
0.50
References
18
Authors
3
Name              Order  Citations  PageRank
Matthias Geel     1      48         6.46
Timothy Church    2      4          0.50
Moira C. Norrie   3      1317       201.70