Title
Suffix Array of Alignment: A Practical Index for Similar Data
Abstract
The <em>suffix tree of alignment</em> is an index data structure for similar strings. Given an alignment of similar strings, it stores all suffixes of the alignment, called <em>alignment-suffixes</em>. An alignment-suffix represents one suffix of a string or suffixes of multiple strings starting at the same position in the alignment. The suffix tree of alignment makes good use of similarity in strings theoretically. However, suffix trees are not widely used in biological applications because of their huge space requirements, and instead suffix arrays are used in practice. In this paper we propose a space-economical version of the suffix tree of alignment, named the <em>suffix array of alignment (SAA)</em>. Given an alignment <em>﾿</em> of similar strings, the SAA for <em>﾿</em> is a lexicographically sorted list of all the alignment-suffixes of <em>﾿</em>. The SAA supports pattern search as efficiently as the <em>generalized suffix array</em>. Our experiments show that our index uses only 14% of the space used by the generalized suffix array to index 11 human genome sequences. The space efficiency of our index increases as the number of the genome sequences increases. We also present an efficient algorithm for constructing the SAA.
Year
DOI
Venue
2013
10.1007/978-3-319-02432-5_27
SPIRE
Keywords
Field
DocType
alignments,indexes for similar data,suffix arrays
Data structure,Suffix,Computer science,Algorithm,Suffix array,Theoretical computer science,Generalized suffix tree,FM-index,Suffix tree,Compressed suffix array,Longest common substring problem
Conference
Volume
ISSN
Citations 
8214
0302-9743
7
PageRank 
References 
Authors
0.44
18
7
Name
Order
Citations
PageRank
Joong Chae Na116218.21
Heejin Park223521.63
Sunho Lee311413.16
Minsung Hong470.44
Thierry Lecroq566258.52
Laurent Mouchard625125.07
Kunsoo Park71396171.00