Title
An Experiment to Test URL Features for Web Page Classification.
Abstract
Web page classification has been extensively researched using different types of features, extracted from the page content, from the page structure, or from other pages that link to that page. Using features from the page itself means the page must be downloaded before it can be classified. We present an experiment to prove that URL tokens contain enough information to extract features for classifying web pages. A classifier based on these features can classify a web page without downloading it first, avoiding unnecessary downloads.
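As an illustration of what "URL tokens" could look like in practice, the following is a minimal sketch, not the authors' method. The function names `url_tokens` and `url_features`, the `<num>` placeholder, and the example URL are all hypothetical; the sketch only assumes that a URL can be split into host, path segments, and query parameter names without fetching the page.

```python
from urllib.parse import urlsplit, parse_qsl


def url_tokens(url):
    """Split a URL into coarse tokens without downloading the page.

    Hypothetical helper: host, path segments, and query parameter names.
    """
    parts = urlsplit(url)
    tokens = [parts.hostname or ""]
    # Path segments, e.g. "/products/view/42" -> "products", "view", "42"
    tokens += [seg for seg in parts.path.split("/") if seg]
    # Keep only query parameter names; values tend to be page-specific
    tokens += [name for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    return [t for t in tokens if t]


def url_features(url):
    """Normalise tokens so URLs that follow the same pattern look alike.

    Assumption: purely numeric tokens (ids, page numbers) are replaced by
    a "<num>" placeholder; everything else is lower-cased.
    """
    return ["<num>" if t.isdigit() else t.lower() for t in url_tokens(url)]


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration
    print(url_features("http://example.com/products/view/42?id=7&lang=en"))
```

Token lists like these could then be fed to any standard classifier; which features and which classifier the experiment actually uses is not stated in the abstract.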
Year
2012
DOI
10.1007/978-3-642-28795-4_13
Venue
TRENDS IN PRACTICAL APPLICATIONS OF AGENTS AND MULTIAGENT SYSTEMS
Field
Same-origin policy, Static web page, World Wide Web, Information retrieval, Web page, Computer science, Download, Anchor text, Tree edit distance, Classifier (linguistics)
DocType
Conference
Volume
157
ISSN
1867-5662
Citations
3
PageRank
0.44
References
16
Authors
4
Name                Order   Citations   PageRank
Inma Hernández      1       76          10.72
Carlos R. Rivero    2       111         16.25
David Ruiz          3       152         20.62
José Luis Arjona    4       19          5.71