Title
Exploiting partial decision trees for feature subset selection in e-mail categorization
Abstract
In this paper we propose PARTfs, which adopts a supervised machine learning algorithm, namely partial decision trees, as a method for feature subset selection. In particular, it is shown that PARTfs achieves an aggressive reduction of the feature space while still yielding classification results comparable to those obtained with conventional feature selection metrics. The approach is empirically verified by employing two different document representations and four different text classification algorithms, applied to a document collection consisting of personal e-mail messages. The results show that the feature space can be reduced by a factor on the order of ten without loss of classification accuracy.
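The idea of letting a supervised tree learner pick the feature subset can be illustrated with a minimal sketch. It is not the PARTfs implementation: it assumes that the selected subset is the set of terms tested in the learned model, and it swaps in a small information-gain decision tree for the partial decision trees named in the abstract. The toy messages, the labels, and the function names (`best_feature`, `tree_features`) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_feature(docs, labels, candidates):
    """Return the candidate term with the highest information gain, or None."""
    base = entropy(labels)
    best, best_gain = None, 0.0
    for term in sorted(candidates):  # sorted so that ties are broken deterministically
        with_term = [y for d, y in zip(docs, labels) if term in d]
        without = [y for d, y in zip(docs, labels) if term not in d]
        if not with_term or not without:
            continue  # the term does not split this subset
        remainder = (
            len(with_term) * entropy(with_term) + len(without) * entropy(without)
        ) / len(labels)
        gain = base - remainder
        if gain > best_gain + 1e-12:
            best, best_gain = term, gain
    return best


def tree_features(docs, labels, candidates, max_depth=5):
    """Grow a tree on term presence and return the set of terms it tests."""
    if max_depth == 0 or len(set(labels)) <= 1:
        return set()
    term = best_feature(docs, labels, candidates)
    if term is None:
        return set()
    used = {term}
    remaining = candidates - {term}
    for present in (True, False):
        idx = [i for i, d in enumerate(docs) if (term in d) == present]
        used |= tree_features(
            [docs[i] for i in idx], [labels[i] for i in idx], remaining, max_depth - 1
        )
    return used


# Hypothetical toy collection: each message is a set of terms plus a folder label.
messages = [
    ({"meeting", "agenda", "monday"}, "work"),
    ({"meeting", "budget", "report"}, "work"),
    ({"agenda", "report", "deadline"}, "work"),
    ({"pizza", "party", "saturday"}, "private"),
    ({"party", "holiday", "photos"}, "private"),
    ({"holiday", "pizza", "monday"}, "private"),
]
docs = [terms for terms, _ in messages]
labels = [label for _, label in messages]

vocabulary = set().union(*docs)
selected = tree_features(docs, labels, vocabulary)

# Project every message onto the selected terms only.
reduced = [d & selected for d in docs]
print(len(vocabulary), "terms reduced to", len(selected), ":", sorted(selected))
```

Any downstream text classifier would then be trained on `reduced` instead of the full term sets. Whether such a reduction preserves accuracy on real e-mail data is an empirical question that this sketch does not address.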
Year: 2006
DOI: 10.1145/1141277.1141536
Venue: SAC
Keywords: partial decision tree, classification accuracy, feature subset selection, different text classification algorithm, document collection, different document representation, conventional feature selection metrics, e-mail categorization, aggressive reduction, feature space, comparable classification result, machine learning, col, feature selection, decision tree
Field: k-nearest neighbors algorithm, Feature vector, Dimensionality reduction, Feature selection, Pattern recognition, Feature (computer vision), Computer science, Feature extraction, Artificial intelligence, Linear classifier, Machine learning, Feature learning
DocType: Conference
ISBN: 1-59593-108-2
Citations: 5
PageRank: 0.50
References: 12
Authors: 3
Name, Order, Citations, PageRank
Helmut Berger, 1, 5, 0.50
Dieter Merkl, 2, 846, 115.65
Michael Dittenbach, 3, 297, 26.48