Title
BLAST++: a tool for BLASTing queries in batches
Abstract
BLAST is the standard tool that molecular biologists use to search for sequence similarity in genomic (and protein) databases. It takes a brute-force approach, comparing a query sequence against every database sequence: for each pair of sequences to be matched, BLAST searches for short fixed-length word pairs (seeds) in the two sequences and then extends them into higher-scoring regions. To search multiple queries, the basic approach is to run BLAST on each query, one at a time. This is inefficient because it fails to exploit common subsequences that the queries may share. In this paper, we propose a new genome search tool, BLAST++, that allows multiple queries, say K, to be searched against a database concurrently. The design of BLAST++ is based on our observation that the seed searching step of BLAST is a bottleneck that consumes more than 80% of the total response time. BLAST++ essentially treats a collection of queries as a single virtual query, so that the seed searching step needs to be performed only once for common subsequences. We implemented BLAST++ as an extension of NCBI BLAST and evaluated its performance. Our study shows that the results obtained by BLAST++ are identical to those obtained by executing BLAST on each of the K queries, but the single-process version of BLAST++ completes the processing in only about 25% of the time taken by the single-process version of NCBI BLAST.
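The idea of treating K queries as a single virtual query can be illustrated with a minimal sketch. This is not the BLAST++ implementation: the function names `build_word_index` and `find_seeds` are hypothetical, seeds are found by exact word matching only, and the extension step is omitted. The sketch shows only that a shared word index lets the database sequence be scanned once for all queries, with a word common to several queries looked up a single time.

```python
from collections import defaultdict


def build_word_index(queries, w):
    """Map each length-w word to every (query_id, offset) where it occurs.

    Hypothetical helper: words shared by several queries end up under one key.
    """
    index = defaultdict(list)
    for qid, query in enumerate(queries):
        for i in range(len(query) - w + 1):
            index[query[i:i + w]].append((qid, i))
    return index


def find_seeds(queries, subject, w):
    """Scan the subject once and collect exact word hits for all queries.

    Returns {query_id: [(query_offset, subject_offset), ...]}.
    Assumption: exact matches only; no scoring and no seed extension.
    """
    index = build_word_index(queries, w)
    seeds = defaultdict(list)
    for j in range(len(subject) - w + 1):
        # One lookup per subject word serves every query that contains it.
        for qid, i in index.get(subject[j:j + w], ()):
            seeds[qid].append((i, j))
    return dict(seeds)


queries = ["ACGTAC", "GTACGG"]
subject = "TTGTACGGA"
hits = find_seeds(queries, subject, w=4)
print(hits)
```

In this toy input both queries contain the word GTAC, so the single occurrence of GTAC in the subject yields a seed for each query from one index lookup, rather than from two separate scans.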
Year: 2003
Venue: APBC
Keywords: blast search, blasting query, query sequence, total response time, database sequence, ncbi blast, new genome search tool, common subsequence, sequence similarity, shorter time, k query, seed, blast, cluster
Field: Bottleneck, Data mining, Computer science, Response time, Rock blasting, Exploit, Brute force, Bioinformatics
DocType: Conference
ISBN: 0-909-92597-6
Citations: 6
PageRank: 0.77
References: 7
Authors: 4
Name            Order   Citations   PageRank
Hao Wang        1       10          1.50
Twee-Hee Ong    2       12          1.61
Beng Chin Ooi   3       7873        1076.70
Kian-Lee Tan    4       6962        776.65