Title
Large Scale Web-Content Classification.
Abstract
Web classification is used in many security devices to prevent users from accessing web sites that the current security policy does not allow, as well as for improving web search and implementing contextual advertising. Many commercial web classification services are available on the market, along with a few publicly available web directory services. Unfortunately, they mostly focus on English-speaking web sites, which makes them unsuitable for other languages in terms of classification reliability and coverage. This paper covers the design and implementation of a web-based classification tool for TLDs (Top-Level Domains). Each domain is classified by analysing the domain's main web site and assigning it to categories according to its content. The tool has been successfully validated by classifying all the registered .it Internet domains; the results are presented in this paper.
Year
2015
DOI
10.5220/0005635605450554
Venue
KDIR
Keywords
Internet Domain, Web-Content Classification, HTTP Crawling, Web Mining, SVM
Field
Web design, Web development, Data mining, World Wide Web, Information retrieval, Web page, Computer science, Web standards, Web query classification, Data Web, Web modeling, Web application security
DocType
Conference
Volume
01
ISBN
978-1-5090-1967-0
Citations
0
PageRank
0.34
References
13
Authors
4
Name, Order, Citations, PageRank
Luca Deri, 1, 292, 32.98
Maurizio Martinelli, 2, 20, 7.36
Daniele Sartiano, 3, 10, 3.98
Loredana Sideri, 4, 0, 0.68