Title | ||
---|---|---|
Using Kernel Density Classifier with Topic Model and Cost Sensitive Learning for Automatic Text Categorization |
Abstract | ||
---|---|---|
This paper proposes a novel framework for the automatic text categorization problem based on the kernel density classifier. The overall goal is to tackle two main issues in automatic text categorization: interpretability and performance. To address interpretability, the Latent Semantic Analysis technique is used to construct a topic space in which each dimension represents a single topic, and the text features are extracted directly from this topic space. To address performance, the classifiers' parameters are optimized for either cost-sensitive or non-cost-sensitive categorization. We have experimentally evaluated the proposed framework using a corpus of twenty newsgroups. The experimental results confirm the effectiveness of the framework in utilizing the features from the topic model for cost-sensitive categorization. |
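
The abstract combines a kernel density classifier with cost-sensitive categorization over topic-space features. The following is a minimal sketch of that idea, not the authors' implementation. It assumes documents have already been projected into a topic space (for example via Latent Semantic Analysis), uses an isotropic Gaussian kernel with a fixed bandwidth, and makes the decision cost-sensitive through a minimum-expected-cost rule rather than the parameter optimization described in the paper. The function names, the toy two-topic data, and the cost tables are all hypothetical.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Average of isotropic Gaussian kernels centred on each training sample."""
    d = len(x)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (-d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * total / len(samples)


def classify(x, train, cost, bandwidth=0.5):
    """Return the label with the lowest expected misclassification cost.

    train: dict mapping label -> list of topic-space feature vectors
    cost:  dict mapping (true_label, predicted_label) -> cost
    """
    n_total = sum(len(v) for v in train.values())
    # Unnormalised posterior: class prior times class-conditional density.
    score = {
        c: (len(v) / n_total) * gaussian_kde_density(x, v, bandwidth)
        for c, v in train.items()
    }
    # Expected cost of predicting each label, up to a common factor.
    risk = {p: sum(cost[(t, p)] * score[t] for t in train) for p in train}
    return min(risk, key=risk.get)


# Hypothetical toy data: two topics, two categories.
train = {
    "sci": [(0.9, 0.1), (0.8, 0.2), (0.7, 0.1)],
    "rec": [(0.1, 0.9), (0.2, 0.8), (0.1, 0.7)],
}

# Non-cost-sensitive case: every error costs the same.
zero_one = {
    ("sci", "sci"): 0.0, ("sci", "rec"): 1.0,
    ("rec", "sci"): 1.0, ("rec", "rec"): 0.0,
}

# Cost-sensitive case: labelling a true "sci" document as "rec" is expensive.
skewed = dict(zero_one)
skewed[("sci", "rec")] = 20.0

borderline = (0.45, 0.55)
print(classify(borderline, train, zero_one))
print(classify(borderline, train, skewed))
```

With equal costs the rule reduces to picking the class with the largest prior-weighted density. Raising the cost of one error type shifts the decision boundary toward the cheaper mistake, so a borderline document can change label without any change to the density estimates.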
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICDAR.2009.145 | ICDAR-1 |
Keywords | Field | DocType |
topic model,non-cost-sensitive categorization,text feature,single topic,automatic text categorization problem,cost sensitive learning,cost-sensitive categorization,kernel density classifier,proposed framework,novel framework,interpretability issue,topic space,automatic text categorization,niobium,latent semantic analysis,accuracy,context modeling,cost function,feature extraction,sparse matrices,kernel,estimation,distribution functions,support vector machines,bandwidth,kernel density,learning artificial intelligence,text analysis | Categorization,Interpretability,Pattern recognition,Computer science,Support vector machine,Context model,Artificial intelligence,Topic model,Latent semantic analysis,Classifier (linguistics),Machine learning,Kernel density estimation | Conference |
Citations | PageRank | References |
0 | 0.34 | 8 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dwi Sianto Mansjur | 1 | 7 | 2.28 |
Ted S. Wada | 2 | 37 | 6.37 |
Biing-Hwang Juang | 3 | 3388 | 699.72 |