Title | ||
---|---|---|
Classifying Structured Web Sources Using Support Vector Machine and Aggressive Feature Selection |
Abstract | ||
---|---|---|
This paper studies the problem of classifying structured data sources on the Web. While prior work uses all features extracted from search interfaces, we further refine the feature set. We use only the text content of the search interfaces and select a subset of features suited to classifying web sources, using feature selection methods with new metrics and a novel, simple ranking scheme. Using this aggressive feature selection approach together with a Support Vector Machine classifier, we obtained high classification performance in an evaluation over real web data. |
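
The feature selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's metrics or ranking scheme, so the code below substitutes the standard chi-square statistic as a stand-in. The toy documents, the labels, the function names, and the cutoff `k` are all hypothetical, and the resulting vectors are only the kind of input one might pass to an SVM classifier.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the interface text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def chi_square_scores(docs, labels, target):
    """Score each term with the chi-square statistic for one category.

    Chi-square is a stand-in here, not the metric proposed in the paper.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        terms = set(tokenize(text))
        (df_pos if y == target else df_neg).update(terms)
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # target documents containing the term
        b = df_neg[term]   # other documents containing the term
        c = n_pos - a      # target documents without the term
        d = n_neg - b      # other documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, k):
    """Keep the k terms with the highest score over all categories.

    A small k relative to the vocabulary makes the selection aggressive.
    """
    best = Counter()
    for target in set(labels):
        for term, score in chi_square_scores(docs, labels, target).items():
            best[term] = max(best[term], score)
    # Break ties alphabetically so the result is deterministic.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return sorted(ranked[:k])


def vectorize(text, vocabulary):
    """Map a text to term counts over the selected vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[t] for t in vocabulary]


# Hypothetical search-interface texts from two domains.
docs = [
    "title author isbn publisher",
    "author title keyword search",
    "departure city arrival city date",
    "airline departure date passengers search",
]
labels = ["books", "books", "airfares", "airfares"]

vocabulary = select_features(docs, labels, k=4)
vectors = [vectorize(d, vocabulary) for d in docs]
```

In this toy setup the term "search" occurs equally often in both domains, so it scores zero and is dropped, while terms that occur in only one domain are kept.
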
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/978-3-642-12436-5_20 | Lecture Notes in Business Information Processing |
Keywords | Field | DocType |
---|---|---|
Deep web, Classification, Database, Feature selection, SVM, Support Vector Machine | Structured support vector machine, Data mining, Feature vector, Feature selection, Ranking, Feature (computer vision), Computer science, Support vector machine, Feature set, Artificial intelligence, Data model, Machine learning | Conference |
Volume | ISSN | Citations |
---|---|---|
45 | 1865-1348 | 2 |
PageRank | References | Authors |
---|---|---|
0.38 | 25 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hieu Quang Le | 1 | 2 | 0.38 |
Stefan Conrad | 2 | 168 | 105.91 |