Title
Classifying Structured Web Sources Using Support Vector Machine and Aggressive Feature Selection
Abstract
This paper studies the problem of classifying structured data sources on the Web. Whereas prior work uses all features extracted from search interfaces, we refine the feature set further. We use only the text content of the search interfaces, and from it we select a subset of features suited to classifying web sources, using feature selection methods with new metrics and a novel, simple ranking scheme. Combining this aggressive feature selection approach with a Support Vector Machine classifier, we obtained high classification performance in an evaluation over real web data.
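The feature selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's new metrics or its ranking scheme, so this example substitutes the standard chi-square statistic and a plain top-k cutoff as stand-ins. The function names, the toy interface texts, and the two domain labels are all hypothetical, and the Support Vector Machine stage is left out.

```python
# Assumption: chi-square is a stand-in metric, not the metric from the paper.
# All names and the toy data below are hypothetical illustrations.

def tokenize(text):
    """Lowercase whitespace tokenization of a search interface's text."""
    return text.lower().split()

def chi_square_scores(docs, labels):
    """Score each term by its maximum chi-square value over all classes,
    using document-level presence/absence of the term."""
    doc_terms = [set(tokenize(d)) for d in docs]
    vocab = set().union(*doc_terms)
    n = len(docs)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            # 2x2 contingency counts: term present/absent vs. in/out of class
            n11 = n10 = n01 = n00 = 0
            for terms, label in zip(doc_terms, labels):
                present, in_cls = term in terms, label == cls
                if present and in_cls:
                    n11 += 1
                elif present:
                    n10 += 1
                elif in_cls:
                    n01 += 1
                else:
                    n00 += 1
            denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
            if denom:  # a zero margin means the term carries no signal
                best = max(best, n * (n11 * n00 - n10 * n01) ** 2 / denom)
        scores[term] = best
    return scores

def select_top_k(scores, k):
    """Keep only the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy search-interface texts for two hypothetical domains.
docs = [
    "title author isbn search",
    "author title publisher search",
    "departure city arrival date search",
    "departure date passengers search",
]
labels = ["books", "books", "flights", "flights"]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 4)
print(selected)
```

In this toy run the term "search" appears in every interface and scores zero, so an aggressive cutoff discards it along with the rarer terms. In a full pipeline, the selected terms would define the feature vectors passed to a classifier such as an SVM.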
Year
2009
DOI
10.1007/978-3-642-12436-5_20
Venue
Lecture Notes in Business Information Processing
Keywords
Deep web, Classification, Database, Feature selection, SVM, Support Vector Machine
Field
Structured support vector machine, Data mining, Feature vector, Feature selection, Ranking, Feature (computer vision), Computer science, Support vector machine, Feature set, Artificial intelligence, Data model, Machine learning
DocType
Conference
Volume
45
ISSN
1865-1348
Citations
2
PageRank
0.38
References
25
Authors
2
Name | Order | Citations | PageRank
Hieu Quang Le | 1 | 2 | 0.38
Stefan Conrad | 2 | 168 | 105.91