Title
Parallel continuous flow: a parallel suffix tree construction tool for whole genomes.
Abstract
The construction of suffix trees for very long sequences is essential for many applications, and it plays a central role in bioinformatics. With the advent of modern sequencing technologies, biological sequence databases have grown dramatically. The methodologies required to analyze these data have also become more complex every day, requiring fast queries across multiple genomes. In this article, we present parallel continuous flow (PCF), a parallel suffix tree construction method that is suitable for very long genomes. We tested our method by constructing the suffix tree of the entire human genome, about 3 GB. We showed that PCF scales gracefully as the size of the input genome grows. Our method works with an efficiency of 90% with 36 processors and 55% with 172 processors. We can index the human genome in 7 minutes using 172 processes.
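The efficiency figures in the abstract can be turned into approximate speedups. The sketch below is not from the article: it assumes the usual definition of parallel efficiency, E = S / p (speedup divided by process count), and the function name `speedup_from_efficiency` is a hypothetical helper introduced only for illustration.

```python
def speedup_from_efficiency(efficiency: float, processes: int) -> float:
    """Speedup implied by a parallel efficiency, assuming E = S / p."""
    return efficiency * processes


# (efficiency, process count) pairs as quoted in the abstract.
reported = [(0.90, 36), (0.55, 172)]

for eff, p in reported:
    s = speedup_from_efficiency(eff, p)
    # Under the assumed definition, this is the speedup over one process.
    print(f"p={p}: efficiency {eff:.0%} implies speedup of about {s:.1f}x")
```

Under that assumption, the quoted figures correspond to speedups of roughly 32x on 36 processors and 95x on 172 processors. The article may define its baseline differently, so these values are indicative only.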
Year
2014
DOI
10.1089/cmb.2012.0256
Venue
Journal of Computational Biology
DocType
Journal
Volume
21
Issue
4
ISSN
1066-5277
Citations
0
PageRank
0.34
References
23
Authors
2
Name             Order  Citations  PageRank
Matteo Comin     1      191        20.94
Montse Farreras  2      163        12.94