Title
Error Correcting Codes with Optimized Kullback-Leibler Distances for Text Categorization
Abstract
We extend a multi-class categorization scheme proposed by Dietterich and Bakiri (1995), which combines binary classifiers by means of error correcting codes. The extension comprises the computation of the codes by a simulated annealing algorithm and the optimization of Kullback-Leibler (KL) category distances within the code-words. For the first time, we apply the scheme to text categorization with support vector machines (SVMs) on several large text corpora with more than 100 categories. The results are compared to 1-of-N coding (i.e., one SVM for each text category). We also investigate codes with optimized KL distance between the text categories that are merged in the code-words. We find that error correcting codes perform better than 1-of-N coding as the code length increases. For very long codes, the performance is in some cases further improved by KL-distance optimization.
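To illustrate the general idea of error correcting output codes referred to in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical 7-bit code for four classes, where each bit position stands for one binary classifier, and decodes by nearest Hamming distance. The names CODE, ONE_OF_N, hamming, min_distance, and decode are illustrative only. The simulated annealing search and the KL-distance optimization described in the abstract are not reproduced here.

```python
from itertools import combinations

# Hypothetical code matrix: one row (code-word) per class, one column per
# binary classifier. Chosen for illustration; not taken from the paper.
CODE = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0],
]

# 1-of-N coding for comparison: one classifier per class (identity matrix).
ONE_OF_N = [[int(i == j) for j in range(4)] for i in range(4)]


def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code):
    """Smallest Hamming distance between any two code-words."""
    return min(hamming(a, b) for a, b in combinations(code, 2))


def decode(code, bits):
    """Index of the class whose code-word is closest to the predicted bits."""
    return min(range(len(code)), key=lambda k: hamming(code[k], bits))


# Simulate one wrong binary classifier for a document of class 2.
noisy = list(CODE[2])
noisy[0] ^= 1
predicted = decode(CODE, noisy)
```

With a minimum distance of d between code-words, nearest-code-word decoding tolerates up to floor((d - 1) / 2) wrong binary decisions. The sketched 7-bit code has d = 4 and so recovers from the single error above, while 1-of-N coding has d = 2 and cannot guarantee recovery from any single error.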
Year
2001
DOI
10.1007/3-540-44794-6_22
Venue
PKDD
Keywords
binary classifier, kl-distance optimization, 1-of-n coding, error correcting codes, multi-class categorization scheme, code length, category distance, text categorization, text category, large text corpus, optimized kl distance, optimized kullback-leibler distances, error correction code, kullback leibler distance, simulated annealing algorithm, support vector machine, kullback leibler
Field
BCJR algorithm, Computer science, Artificial intelligence, Categorization, Concatenated error correction code, Pattern recognition, Block code, Support vector machine, Binary code, Algorithm, Error detection and correction, Linear code, Machine learning
DocType
Conference
ISBN
3-540-42534-9
Citations
10
PageRank
0.75
References
9
Authors
3
Name | Order | Citations | PageRank
Jörg Kindermann | 1 | 411 | 33.66
Gerhard Paass | 2 | 1136 | 83.63
Edda Leopold | 3 | 81 | 30.50