Indexing Genomic Databases for Fast Homology Searching - Citegraph

Paper Info

Title
Indexing Genomic Databases for Fast Homology Searching

Abstract
Genomic sequence databases have been widely used by molecular biologists for homology searching. However, as amino acid and nucleotide databases are growing in size at an alarming rate, the traditional brute-force approach of comparing a query sequence against each of the database sequences is becoming prohibitively expensive. In this paper, we re-examine the problem of searching for homology in large protein databases. We propose a novel filter-and-refine approach to speed up the search process. The scheme operates in two phases. In the filtering phase, a small set of candidate database sequences (as compared to all sequences in the database) is quickly identified. This is realized using a signature-based scheme. In the refinement phase, the query sequence is matched against the sequences in the candidate set using any local alignment strategy. Our preliminary experimental results show that the proposed method yields significant savings in computation without sacrificing the accuracy of the answers as compared to FASTA.
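
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the paper's signatures are built, so everything specific here is an assumption made for illustration: signatures are taken to be fixed-width bitmaps of hashed k-mers, the filter keeps sequences whose signature covers a chosen fraction of the query's bits, and the refinement step is a plain Smith-Waterman score. The names `signature`, `build_index`, `search`, `K`, `SIG_BITS`, and `min_overlap` are hypothetical and do not come from the paper.

```python
import zlib
from typing import Dict, List, Tuple

K = 3           # k-mer length (assumed for illustration)
SIG_BITS = 256  # signature width in bits (assumed for illustration)


def signature(seq: str, k: int = K, bits: int = SIG_BITS) -> int:
    """Superimpose hashed k-mers of seq into one fixed-width bitmap."""
    sig = 0
    for i in range(len(seq) - k + 1):
        sig |= 1 << (zlib.crc32(seq[i:i + k].encode()) % bits)
    return sig


def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def build_index(database: Dict[str, str]) -> Dict[str, int]:
    """Precompute one signature per database sequence."""
    return {name: signature(seq) for name, seq in database.items()}


def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + step,
                          prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search(query: str, database: Dict[str, str], index: Dict[str, int],
           min_overlap: float = 0.5) -> List[Tuple[str, int]]:
    """Filter by signature overlap, then refine by local alignment."""
    q_sig = signature(query)
    q_bits = popcount(q_sig)
    if q_bits == 0:  # query shorter than k: nothing to filter on
        return []
    # Filtering phase: cheap bitwise test against every stored signature.
    candidates = [name for name, sig in index.items()
                  if popcount(q_sig & sig) / q_bits >= min_overlap]
    # Refinement phase: full alignment only on the surviving candidates.
    scored = [(name, smith_waterman(query, database[name]))
              for name in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    db = {"a": "MKTAYIAKQRQISFVK", "b": "GGGGGGGGGGGG", "c": "MKTAYIAKQR"}
    print(search("MKTAYIAKQRQISFVK", db, build_index(db)))
```

In this sketch the saving comes from running the quadratic-time alignment only on sequences that pass the bitwise test. A filter of this kind can discard true homologs if `min_overlap` is set too high, so the threshold trades speed against sensitivity; how the paper's own scheme handles that trade-off is not stated in the abstract.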

Year	DOI	Venue
2002	10.1007/3-540-46146-9_86	DEXA
Keywords	Field	DocType
proposed method result,novel filter-and-refine approach,query sequence,database sequence,refinement phase,candidate database sequence,large protein databases,signature-based scheme,indexing genomic databases,nucleotide databases,genomic sequence databases,fast homology searching,genome sequence,local alignment,nucleotides,indexation,amino acid	Data mining,Computer science,Search engine indexing,Filter (signal processing),Brute force,Smith–Waterman algorithm,Homology (biology),Small set,Database,Speedup,Computation	Conference
ISBN	Citations	PageRank
3-540-44126-3	1	0.38
References	Authors
5	3

Authors (3 rows)

Name	Order	Citations	PageRank
Twee-Hee Ong	1	12	1.61
Kian-Lee Tan	2	6962	776.65
Hao Wang	3	16	3.28
