Title
Mix-n-Match: building personal libraries from web content
Abstract
We present an approach to web content aggregation that allows information to be harvested from web pages, independent of specific markup languages. It builds on ideas from data warehousing, and we present solutions to the well-known problems of data integration, namely detection of equivalences and data cleaning, adapted to this context. We describe how the content aggregation engine has been realised as an extensible framework in such a way that end-users as well as developers can use the associated tools to create personal libraries of content extracted from the web.
Year
2012
DOI
10.1007/978-3-642-33290-6_37
Venue
TPDL
Keywords
specific markup language, extensible framework, web page, personal library, data warehousing, data integration, personal libraries, content aggregation engine, web content aggregation, present solution, associated tool
Field
Data warehouse, Data integration, Web development, Data mining, World Wide Web, Information retrieval, Web page, Computer science, Web modeling, Web content, Markup language
DocType
Conference
Citations
1
PageRank
0.36
References
11
Authors
3
Name | Order | Citations | PageRank
Matthias Geel | 1 | 48 | 6.46
Timothy Church | 2 | 1 | 0.36
Moira C. Norrie | 3 | 1317 | 201.70