Title
Exploring I/O Strategies for Parallel Sequence-Search Tools with S3aSim.
Abstract
Parallel sequence-search tools are rising in popularity among computational biologists. With the rapid growth of sequence databases, database segmentation is the trend of the future for such search tools. While I/O currently is not a significant bottleneck for parallel sequence-search tools, future technologies including faster processors, customized computational hardware such as FPGAs, improved search algorithms, and exponentially growing databases will emphasize an increasing need for efficient parallel I/O in future parallel sequence-search tools. Our paper focuses on examining different I/O strategies for these future tools in a modern parallel file system (PVFS2). Because implementing and comparing various I/O algorithms in every search tool is labor-intensive and time-consuming, we introduce S3aSim, a general simulation framework for sequence-search which allows us to quickly implement, test, and profile various I/O strategies. We examine a variety of I/O strategies (e.g., master-writing and various worker-writing strategies using individual and collective I/O methods) for storing result data in sequence-search tools such as mpiBLAST, pioBLAST, and parallel HMMer. Our experiments fully detail the interaction of computing and I/O within a full application simulation as opposed to typical I/O-only benchmarks.
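The difference between the master-writing and worker-writing strategies named in the abstract can be illustrated with a small sketch. This is not S3aSim code and uses no MPI or PVFS2: the function names (`master_write`, `worker_offsets`, `worker_write`) and the in-memory "file" are assumptions made purely for illustration. It shows the one piece of bookkeeping that worker-writing needs, namely that each worker must know the byte offset of its own region, which is an exclusive prefix sum over the result sizes.

```python
from itertools import accumulate

def master_write(results):
    """Master-writing: one process receives every result and appends them in order."""
    out = bytearray()
    for chunk in results:              # results[i] holds worker i's result bytes
        out.extend(chunk)
    return bytes(out)

def worker_offsets(sizes):
    """Exclusive prefix sum: byte offset where each worker's region starts."""
    return list(accumulate(sizes, initial=0))[:-1]

def worker_write(results):
    """Worker-writing: each worker writes its own region of a shared file."""
    sizes = [len(chunk) for chunk in results]
    offsets = worker_offsets(sizes)
    out = bytearray(sum(sizes))        # stand-in for the shared output file
    for off, chunk in zip(offsets, results):   # one iteration per (hypothetical) worker
        out[off:off + len(chunk)] = chunk
    return bytes(out)

# Hypothetical per-worker results; worker 2 found no hits.
results = [b"hit-A\n", b"hit-B1\nhit-B2\n", b"", b"hit-D\n"]
print(worker_offsets([len(c) for c in results]))
print(master_write(results) == worker_write(results))
```

Both paths produce the same bytes; what differs in a real parallel run is who performs the writes and whether they are issued individually or as one collective operation.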
Year
2006
DOI
10.1109/HPDC.2006.1652154
Venue
HPDC
Keywords
database management systems, search algorithm, computational biologist, parallel processing
Field
Bottleneck, File system, Search algorithm, Computer science, Segmentation, Parallel computing, Parallel processing, Field-programmable gate array, Input/output, Distributed computing
DocType
Conference
ISSN
1082-8907
ISBN
1-4244-0307-3
Citations
4
PageRank
0.50
References
11
Authors
5
Name                 Order   Citations   PageRank
Avery Ching          1       221         16.21
Wu-chun Feng         2       2812        232.50
Heshan Lin           3       375         23.13
Xiaosong Ma          4       1117        68.36
Alok N. Choudhary    5       3441        326.32