Abstract | ||
---|---|---|
BLAST is the standard tool that molecular biologists use to search for sequence similarity in genomic (and protein) databases. It takes a brute-force approach, comparing a query sequence against every database sequence: for each pair of sequences to be matched, BLAST searches for short fixed-length word pairs (seeds) in the sequences and then extends them into higher-scoring regions. To search multiple queries, the basic approach is to run BLAST on each query, one at a time. This is clearly inefficient and fails to exploit common subsequences that the collection of queries may share. In this paper, we propose a new genome search tool, BLAST++, that allows multiple, say K, queries to be searched against a database concurrently. The design of BLAST++ is based on our observation that the seed searching step of BLAST is a bottleneck that consumes more than 80% of the total response time. BLAST++ essentially treats a collection of queries as a single virtual query, so that the seed searching step needs to be performed only once for common subsequences. We implemented BLAST++ as an extension of NCBI BLAST and evaluated its performance. Our study shows that the results obtained by BLAST++ are identical to those obtained by executing BLAST on each of the K queries, but the single-process version of BLAST++ completes processing in only about 25% of the time taken by the original single-process version of NCBI BLAST. |
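
The idea of treating K queries as a single virtual query can be illustrated with a minimal sketch. It is not the BLAST++ implementation, which the abstract does not detail. It assumes exact matching of length-`w` words only, with no neighborhood words, scoring threshold, or extension step. The function names `build_word_index`, `find_seeds`, and `find_seeds_one_by_one`, and the default `w=4`, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_word_index(queries, w):
    """Map each length-w word to every (query_id, offset) where it occurs.

    A word shared by several queries gets one index entry listing all of
    them, which is what lets one lookup serve the whole batch.
    """
    index = defaultdict(list)
    for qid, query in enumerate(queries):
        for i in range(len(query) - w + 1):
            index[query[i:i + w]].append((qid, i))
    return index


def find_seeds(queries, subject, w=4):
    """Scan the database sequence once for all queries (illustrative only).

    Returns {query_id: [(query_offset, subject_offset), ...]}.
    """
    index = build_word_index(queries, w)
    seeds = defaultdict(list)
    for j in range(len(subject) - w + 1):
        # One lookup per database word, regardless of how many queries share it.
        for qid, i in index.get(subject[j:j + w], ()):
            seeds[qid].append((i, j))
    return dict(seeds)


def find_seeds_one_by_one(queries, subject, w=4):
    """Baseline: repeat the full database scan separately for each query."""
    result = {}
    for qid, query in enumerate(queries):
        hits = find_seeds([query], subject, w).get(0)
        if hits:
            result[qid] = hits
    return result
```

Calling `find_seeds(["ACGTAC", "CGTACG"], "TTACGTACGG")` scans the database sequence once, while `find_seeds_one_by_one` scans it once per query. Both return the same seeds, mirroring the abstract's claim that batching changes the cost but not the results.
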
Year | Venue | Keywords |
---|---|---|
2003 | APBC | blast search,blasting query,query sequence,total response time,database sequence,ncbi blast,new genome search tool,common subsequence,sequence similarity,shorter time,k query,seed,blast,cluster |
Field | DocType | ISBN |
Bottleneck,Data mining,Computer science,Response time,Rock blasting,Exploit,Brute force,Bioinformatics | Conference | 0-909-92597-6 |
Citations | PageRank | References |
6 | 0.77 | 7 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hao Wang | 1 | 10 | 1.50 |
Twee-Hee Ong | 2 | 12 | 1.61 |
Beng Chin Ooi | 3 | 7873 | 1076.70 |
Kian-Lee Tan | 4 | 6962 | 776.65 |