Title
Data selection based on decision tree for SVM classification on large data sets
Abstract
Highlights: This paper describes the development of an algorithm for training SVM on large data sets. The algorithm uses a first SVM stage trained on a small data set. The algorithm then uses a decision tree (DT) to find the best data points in the entire data set. The DT is trained using the support vectors (SV) and non-SV found in the first SVM stage. In the second SVM stage, the training data are all the data points found by the DT.
Support Vector Machine (SVM) has important properties such as a strong mathematical background and better generalization capability than other classification methods. Its major drawback lies in the training phase, which is computationally expensive and highly dependent on the size of the input data set. In this study, a new algorithm to reduce the training time of SVM is presented; the method selects a small, representative subset of the data set to train on. The novel method uses an induction tree to reduce the training data set for SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations while training faster.
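The two-stage procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, an RBF kernel, a uniformly random initial subset, and a shallow tree, none of which are specified in the abstract. The function name `two_stage_svm`, its parameters, and the fallback to the full data set are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def two_stage_svm(X, y, n_initial=200, seed=0):
    """Hypothetical sketch: SVM on a small subset -> decision tree -> SVM."""
    rng = np.random.default_rng(seed)

    # Stage 1: train an SVM on a small random subset (subset size is an assumption).
    idx = rng.choice(len(X), size=min(n_initial, len(X)), replace=False)
    svm1 = SVC(kernel="rbf", gamma="scale").fit(X[idx], y[idx])

    # Mark each subset point as support vector (1) or non-support vector (0).
    is_sv = np.zeros(len(idx), dtype=int)
    is_sv[svm1.support_] = 1

    # Train a decision tree to separate SV-like points from the rest.
    tree = DecisionTreeClassifier(max_depth=5, random_state=seed)
    tree.fit(X[idx], is_sv)

    # Apply the tree to the entire data set and keep the points it flags as SV-like.
    selected = tree.predict(X) == 1

    # Assumed safeguard: if the selection is empty or single-class, use all data.
    if len(np.unique(y[selected])) < 2:
        selected = np.ones(len(X), dtype=bool)

    # Stage 2: retrain the SVM on the selected points only.
    svm2 = SVC(kernel="rbf", gamma="scale").fit(X[selected], y[selected])
    return svm2, selected


# Synthetic data stands in for a large data set.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
model, selected = two_stage_svm(X, y)
print("fraction of data kept for stage 2:", selected.mean())
print("accuracy on the full data set:", model.score(X, y))
```

In this sketch, any speedup would come from the second SVM seeing only the points the tree flags as likely support vectors. How much data is kept, and how accuracy compares with training on everything, depends on the data and on the assumed settings above.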
Year
2015
DOI
10.1016/j.asoc.2015.08.048
Venue
Applied Soft Computing
Keywords
SVM, Classification, Large data sets
Field
Data point, Decision tree, Data set, Small data, Pattern recognition, Ranking SVM, Data selection, Computer science, Support vector machine, Artificial intelligence, Machine learning, Speedup
DocType
Journal
Volume
37
Issue
C
ISSN
1568-4946
Citations
7
PageRank
0.61
References
50
Authors
5