Title
Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures.
Abstract
A bisection-type algorithm for the grammar-based compression of tree-structured data has been proposed recently. In this framework, an elementary ordered-tree grammar (EOTG) and an elementary unordered-tree grammar (EUTG) were defined, and an approximation algorithm was proposed.In this paper, we propose an integer programming-based method that finds the minimum context-free grammar (CFG) for a given string under the condition that at most two symbols appear on the right-hand side of each production rule. Next, we extend this method to find the minimum EOTG and EUTG grammars for given ordered and unordered trees, respectively. Then, we conduct computational experiments for the ordered and unordered artificial trees. Finally, we apply our methods to pattern extraction of glycan tree structures.We propose integer programming-based methods that find the minimum CFG, EOTG, and EUTG for given strings, ordered and unordered trees. Our proposed methods for trees are useful for extracting patterns of glycan tree structures.
Year
DOI
Venue
2010
10.1186/1471-2105-11-S11-S4
BMC Bioinformatics
Keywords
Field
DocType
computer experiment,computational biology,polysaccharides,microarrays,context free grammar,data compression,algorithms,tree structure,bioinformatics
Stochastic context-free grammar,Terminal and nonterminal symbols,ID/LP grammar,Grammar induction,Grammar-based code,Computer science,Algorithm,Theoretical computer science,Grammar,Tree structure,Bioinformatics,Data compression
Journal
Volume
Issue
ISSN
11 Suppl 11
Suppl 11
1471-2105
Citations 
PageRank 
References 
18
0.62
9
Authors
3
Name
Order
Citations
PageRank
Yang Zhao1836116.78
Morihiro Hayashida215421.88
Tatsuya Akutsu32169216.05