Paper Info

Title
MPI framework for parallel searching in large biological databases

Abstract
In this paper, we address the problem of searching huge biological databases, on the scale of at least several gigabytes, by means of parallel processing. Biological databases storing DNA sequences, protein sequences, or mass spectra are growing exponentially, and searches through them consume exponentially growing computational resources as well. We present a general-purpose, MPI-based C++ framework for splitting databases among several computational nodes. The combined RAM of the nodes working in tandem is often sufficient to keep the entire database in memory, and therefore to search it efficiently without paging to disk. The framework runs as a persistent service that processes all submitted queries, which allows for query reordering and better utilization of memory. As a result, we achieve superlinear speedups compared to single-processor implementations. We demonstrate the utility and speedup of the framework using a real biological database and an actual search algorithm for mass spectrometry.
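
The splitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the abstract does not say how records are assigned to nodes, so the even block split below is an assumption, and the names `Record`, `slice_for`, `search_slice`, and `search_all` are hypothetical. To stay self-contained the sketch omits MPI and simulates the workers with a loop; in an MPI program each process would obtain its rank from `MPI_Comm_rank`, keep only its own slice in memory, and return its hits to the master.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type; the key is an illustrative mass value.
struct Record {
    std::string id;
    double mass;
};

// Half-open index range [first, second) owned by `rank` when `total`
// records are split as evenly as possible among `workers` nodes.
// Assumption: contiguous block partitioning.
std::pair<std::size_t, std::size_t> slice_for(std::size_t total,
                                              std::size_t workers,
                                              std::size_t rank) {
    assert(workers > 0 && rank < workers);
    const std::size_t base = total / workers;
    const std::size_t extra = total % workers;  // first `extra` ranks get one more
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}

// Worker side: scan only the local slice and collect ids whose mass
// lies within `tol` of the query mass.
std::vector<std::string> search_slice(const std::vector<Record>& db,
                                      std::pair<std::size_t, std::size_t> slice,
                                      double query_mass, double tol) {
    std::vector<std::string> hits;
    for (std::size_t i = slice.first; i < slice.second; ++i) {
        const double diff = db[i].mass - query_mass;
        if (diff <= tol && diff >= -tol) hits.push_back(db[i].id);
    }
    return hits;
}

// Master side, simulated sequentially: query every slice and merge the
// partial results in rank order.
std::vector<std::string> search_all(const std::vector<Record>& db,
                                    std::size_t workers,
                                    double query_mass, double tol) {
    std::vector<std::string> merged;
    for (std::size_t rank = 0; rank < workers; ++rank) {
        const auto hits =
            search_slice(db, slice_for(db.size(), workers, rank), query_mass, tol);
        merged.insert(merged.end(), hits.begin(), hits.end());
    }
    return merged;
}
```

Because each worker touches only its own slice, the per-node memory footprint shrinks roughly in proportion to the number of nodes, which is what lets the combined RAM hold a database that a single machine would have to page from disk.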

Year	DOI	Venue
2006	10.1016/j.jpdc.2006.08.003	J. Parallel Distrib. Comput.
Keywords	Field	DocType
splitting databases,biological databases,computational node,parallel searching,mass spectrum,large biological databases,huge biological databases,master/worker framework,entire database,real biological database,mpi,computational resource,mass spectrometry,parallel processing,mpi framework,protein sequence,biological database,dna sequence,mass spectra,search algorithm	Search algorithm,Computer science,Parallel processing,Gigabyte,Parallel computing,Biological database,Implementation,Paging,Message passing,Speedup	Journal
Volume	Issue	ISSN
66	12	Journal of Parallel and Distributed Computing
Citations	PageRank	References
3	0.43	11
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Dominic Battré	1	257	20.40
David Sigfredo Angulo	2	3	1.44
