Title
Using Kernel Density Classifier with Topic Model and Cost Sensitive Learning for Automatic Text Categorization
Abstract
This paper proposes a novel framework for automatic text categorization based on the kernel density classifier. The goal is to address two main issues in automatic text categorization: interpretability and performance. To address interpretability, Latent Semantic Analysis is used to construct a topic space in which each dimension represents a single topic, and the text features are extracted directly from this topic space. To address performance, the classifiers’ parameters are optimized for either cost-sensitive or non-cost-sensitive categorization. We evaluate the proposed framework experimentally on a corpus of twenty newsgroups. The results confirm that the framework can effectively use topic-model features for cost-sensitive categorization.
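The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain truncated SVD as the Latent Semantic Analysis step, a Gaussian kernel with a hand-picked bandwidth, and a minimum-expected-cost decision rule. The function names (`fit_topic_space`, `log_kde`, `classify`), the toy document-term matrix, and the cost matrices are all hypothetical.

```python
import numpy as np


def fit_topic_space(doc_term, n_topics):
    """LSA step: keep the top right-singular vectors of the document-term matrix."""
    _, _, vt = np.linalg.svd(doc_term, full_matrices=False)
    return vt[:n_topics].T  # shape (n_terms, n_topics)


def log_kde(train, query, bandwidth):
    """Log of a Gaussian kernel density estimate of `train`, evaluated at `query`."""
    dim = train.shape[1]
    sq_dist = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    log_kernel = -0.5 * sq_dist / bandwidth**2 - 0.5 * dim * np.log(2 * np.pi * bandwidth**2)
    peak = log_kernel.max(axis=1, keepdims=True)  # log-mean-exp for numerical stability
    return (peak + np.log(np.exp(log_kernel - peak).mean(axis=1, keepdims=True))).ravel()


def classify(z_train, y_train, z_query, bandwidth, cost):
    """Pick the class with the lowest expected cost under KDE-based class posteriors.

    cost[i, j] is the cost of predicting class j when the true class is i
    (classes taken in sorted order).
    """
    classes = np.unique(y_train)
    log_joint = np.column_stack([
        log_kde(z_train[y_train == c], z_query, bandwidth) + np.log(np.mean(y_train == c))
        for c in classes
    ])
    posterior = np.exp(log_joint - log_joint.max(axis=1, keepdims=True))
    posterior /= posterior.sum(axis=1, keepdims=True)
    return classes[(posterior @ cost).argmin(axis=1)]


# Toy document-term counts (assumed data): class 0 uses terms 0-2, class 1 uses terms 3-5.
X_train = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [3, 0, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 0, 2],
], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])

topics = fit_topic_space(X_train, n_topics=2)
Z_train = X_train @ topics  # documents expressed in the topic space

clear_docs = np.array([[1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 0]], dtype=float)
mixed_doc = np.array([[1, 0, 0, 1, 0, 0]], dtype=float)  # shares terms with both classes

zero_one_cost = 1.0 - np.eye(2)                      # non-cost-sensitive case
missing_1_costly = np.array([[0.0, 1.0], [10.0, 0.0]])  # true 1 predicted as 0 costs 10
missing_0_costly = np.array([[0.0, 10.0], [1.0, 0.0]])  # true 0 predicted as 1 costs 10

print(classify(Z_train, y_train, clear_docs @ topics, 1.0, zero_one_cost))
print(classify(Z_train, y_train, mixed_doc @ topics, 1.0, missing_1_costly))
print(classify(Z_train, y_train, mixed_doc @ topics, 1.0, missing_0_costly))
```

In this sketch the cost matrix only changes the final decision rule: a document whose class posteriors are close to even is pushed toward whichever class is more expensive to miss. The abstract states that the classifier parameters themselves are optimized for the cost-sensitive setting, which this example does not attempt.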
Year: 2009
DOI: 10.1109/ICDAR.2009.145
Venue: ICDAR-1
Keywords: topic model, non-cost-sensitive categorization, text feature, single topic, automatic text categorization problem, cost sensitive learning, cost-sensitive categorization, kernel density classifier, proposed framework, novel framework, interpretability issue, topic space, automatic text categorization, niobium, latent semantic analysis, accuracy, context modeling, cost function, feature extraction, sparse matrices, kernel, estimation, distribution functions, support vector machines, bandwidth, kernel density, learning artificial intelligence, text analysis
Field: Categorization, Interpretability, Pattern recognition, Computer science, Support vector machine, Context model, Artificial intelligence, Topic model, Latent semantic analysis, Classifier (linguistics), Machine learning, Kernel density estimation
DocType: Conference
Citations: 0
PageRank: 0.34
References: 8
Authors
3
Name | Order | Citations | PageRank
Dwi Sianto Mansjur | 1 | 7 | 2.28
Ted S. Wada | 2 | 37 | 6.37
Biing-Hwang Juang | 3 | 3388 | 699.72