Title
String Kernels Based on Variable-Length-Don't-Care Patterns
Abstract
We propose a new string kernel based on variable-length-don't-care patterns (VLDC patterns). A VLDC pattern is an element of (Σ ∪ {⋆})*, where Σ is an alphabet and ⋆ is the variable-length-don't-care symbol that matches any string in Σ*. The number of VLDC patterns matching a given string s of length n is O(2^{2n}). We present an O(n^5) algorithm for computing the kernel value. We also propose variations of the kernel which modify the relative weights of each pattern. We evaluate our kernels using a support vector machine to classify spam data.
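As an illustration of the matching relation described in the abstract, the following is a minimal sketch of a check for whether a single VLDC pattern matches a string. It is not the paper's kernel algorithm. The function name vldc_match, the use of the ASCII character "*" as a stand-in for the variable-length-don't-care symbol, and the convention that a pattern must account for the whole string are all assumptions made for this example.

```python
def vldc_match(pattern: str, text: str, star: str = "*") -> bool:
    """Return True if the VLDC pattern matches the whole text.

    Assumption: `star` stands in for the variable-length-don't-care
    symbol and does not occur in `text`.
    """
    # reachable[j] is True when the pattern prefix read so far
    # can match text[:j].
    reachable = [j == 0 for j in range(len(text) + 1)]
    for symbol in pattern:
        if symbol == star:
            # The don't-care symbol absorbs any string (possibly empty),
            # so every position at or after a reachable one is reachable.
            seen = False
            for j in range(len(text) + 1):
                seen = seen or reachable[j]
                reachable[j] = seen
        else:
            # An ordinary symbol consumes exactly one equal character.
            reachable = [False] + [
                reachable[j] and text[j] == symbol
                for j in range(len(text))
            ]
    return reachable[len(text)]


if __name__ == "__main__":
    print(vldc_match("a*b", "aXXb"))
    print(vldc_match("ab", "aXb"))
```

The sketch runs in time proportional to the product of the pattern and text lengths. Counting the patterns shared by two strings, which is what a kernel over these patterns requires, needs a separate and more involved computation.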
Year
2008
DOI
10.1007/978-3-540-88411-8_29
Venue
Discovery Science
Keywords
vldc pattern,support vector machine,string s of length,relative weight,variable-length-don't-care symbol,spam data,variable-length-don't-care pattern,variable-length-don't-care patterns,new string kernel,kernel value,string kernel,pattern matching
Field
Kernel (linear algebra),Commentz-Walter algorithm,Symbol,Computer science,Support vector machine,Theoretical computer science,Artificial intelligence,String kernel,Machine learning,Alphabet
DocType
Conference
Volume
5255
ISSN
0302-9743
Citations
0
PageRank
0.34
References
12
Authors
5
Name | Order | Citations | PageRank
Kazuyuki Narisawa | 1 | 33 | 6.82
Hideo Bannai | 2 | 620 | 79.87
Kohei Hatano | 3 | 88 | 21.16
Shunsuke Inenaga | 4 | 595 | 79.02
Masayuki Takeda | 5 | 902 | 79.24