Title
Verifying genre-based clustering approach to content extraction
Abstract
The content of a web page is usually contained within a small body of text and images, or perhaps several articles on the same page; however, the content may be lost in the clutter, particularly hurting users browsing on small cell phone and PDA screens and visually impaired users relying on speech rendering of web pages. Using the genre of a web page, we have created a solution, Crunch, that automatically identifies clutter and removes it, thus leaving a clean content-full page. In order to evaluate the improvement in the applications for this technology, we identified a number of experiments. In this paper, we present those experiments, the associated results and their evaluation.
Year
2006
DOI
10.1145/1135777.1135922
Venue
WWW
Keywords
small cell phone,speed rendering,web page,pda screen,associated result,genre-based clustering approach,small body,clean content-full page,content extraction,html,clustering,accessibility,web pages
Field
Static web page,Content extraction,World Wide Web,Web page,Information retrieval,Computer science,Clutter,Phone,Rendering (computer graphics),Cluster analysis
DocType
Conference
ISBN
1-59593-323-9
Citations
4
PageRank
0.45
References
2
Authors
4
Name
Order
Citations
PageRank
Suhit Gupta1895.39
Hila Becker271730.57
Gail E. Kaiser32262467.05
Sal Stolfo46210741.96