Title
Segmenting strings homogeneously via trees
Abstract
We divide a string into k segments, each with only one sort of symbols, so as to minimize the total number of exceptions. Motivations come from machine learning and data mining. For binary strings we develop a linear-time algorithm for any k. Key to efficiency is a special-purpose data structure, called W-tree, which reflects relations between repetition lengths of symbols. Existence of algorithms faster than obvious dynamic programming remains open for non-binary strings. Our problem is also equivalent to finding weighted independent sets of prescribed size in paths. We show that this problem in bounded-degree graphs is FPT.
Year
DOI
Venue
2007
10.1007/978-3-540-74839-7_21
WG
Keywords
Field
DocType
data mining,prescribed size,obvious dynamic programming,non-binary string,linear-time algorithm,bounded-degree graph,k segment,repetition length,segmenting strings homogeneously,binary string,special-purpose data structure,machine learning,parameterized complexity,data structure,independent set,segmentation,dynamic programming
Data structure,Dynamic programming,Discrete mathematics,Combinatorics,Parameterized complexity,Market segmentation,Computer science,Segmentation,sort,Greedy algorithm,Priority queue
Conference
Volume
ISSN
ISBN
4769
0302-9743
3-540-74838-5
Citations 
PageRank 
References 
0
0.34
5
Authors
1
Name
Order
Citations
PageRank
Peter Damaschke147156.99