A DNA index structure using frequency and position information of genetic alphabet - Citegraph

Paper Info

Title
A DNA index structure using frequency and position information of genetic alphabet

Abstract
Exact match queries, wildcard match queries, and k-mismatch queries are widely used in many molecular biology applications, including searches for ESTs (Expressed Sequence Tags) and DNA transcription factors. In this paper, we suggest an efficient indexing and processing mechanism for such queries. Our indexing method places a sliding window at every possible location of a DNA sequence and extracts its signature by considering the occurrence frequency of each nucleotide. It then stores the set of signatures in a multi-dimensional index, such as the R*-tree. Also, by assigning a weight to each position of a window, it prevents signatures from being concentrated around a few spots in the indexing space. Our query processing method converts a query sequence into a multi-dimensional rectangle and searches the index for the signatures that overlap the rectangle.
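
The pipeline described in the abstract (sliding window, per-nucleotide signature, position weights, rectangle query) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the abstract does not give the weighting scheme, so the weights here are simply 1, 2, ..., w; the wildcard symbol "N" and all function names are hypothetical; and a plain list scan stands in for the R*-tree. The query is assumed to have the same length as the window, and only exact and wildcard queries are covered.

```python
from typing import List, Tuple

ALPHABET = "ACGT"  # one signature dimension per nucleotide


def position_weights(w: int) -> List[int]:
    # Assumed weighting (not taken from the paper): position i gets weight i + 1.
    return [i + 1 for i in range(w)]


def signature(window: str, weights: List[int]) -> Tuple[int, ...]:
    # Weighted occurrence frequency of each nucleotide in the window.
    sig = [0] * len(ALPHABET)
    for ch, wt in zip(window, weights):
        sig[ALPHABET.index(ch)] += wt
    return tuple(sig)


def build_index(seq: str, w: int, weights: List[int]):
    # One (signature, start position) entry per window placement.
    # A list stands in for the multi-dimensional index (e.g. an R*-tree).
    return [(signature(seq[i:i + w], weights), i)
            for i in range(len(seq) - w + 1)]


def query_rectangle(query: str, weights: List[int], wildcard: str = "N"):
    # Fixed positions set the lower corner; every wildcard position could
    # add its weight to any one dimension, which widens the upper corner.
    low = [0] * len(ALPHABET)
    slack = 0
    for ch, wt in zip(query, weights):
        if ch == wildcard:
            slack += wt
        else:
            low[ALPHABET.index(ch)] += wt
    high = [v + slack for v in low]
    return low, high


def search(seq: str, index, query: str, weights: List[int],
           wildcard: str = "N") -> List[int]:
    # Filter step: keep signatures that fall inside the query rectangle.
    # Refine step: verify each candidate window against the query.
    low, high = query_rectangle(query, weights, wildcard)
    hits = []
    for sig, pos in index:
        if all(l <= s <= h for l, s, h in zip(low, sig, high)):
            window = seq[pos:pos + len(query)]
            if all(q == wildcard or q == c for q, c in zip(query, window)):
                hits.append(pos)
    return hits


if __name__ == "__main__":
    seq = "ACGTACGA"
    weights = position_weights(4)
    index = build_index(seq, 4, weights)
    print(search(seq, index, "ACGT", weights))  # exact match query
    print(search(seq, index, "ACGN", weights))  # wildcard match query
```

With uniform weights, the windows ACGT and TGCA would share one signature, since each nucleotide occurs once in both. With the assumed position weights they map to different points, which is the spreading effect the abstract attributes to weighting. The refine step is needed because distinct windows can still share a signature.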

Year	DOI	Venue
2005	10.1007/11430919_21	PAKDD
Keywords	Field	DocType
genetic alphabet,k-mismatch query,indexing method,multi-dimensional index,position information,dna index structure,processing mechanism,dna transcription factor,dna sequence,multi-dimensional rectangle,efficient indexing,exact match query,indexing space,expressed sequence tag,nucleotides,molecular biology,indexing,sliding window,genetics,transcription factor,indexation	R-tree,Data mining,Wildcard,Expressed sequence tag,Sliding window protocol,DNA database,Computer science,Rectangle,Search engine indexing,Algorithm,Knowledge extraction	Conference
Volume	ISSN	ISBN
3518	0302-9743	3-540-26076-5
Citations	PageRank	References
0	0.34	7
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Woo-Cheol Kim	1	37	5.46
Sanghyun Park	2	729	80.64
Jung-Im Won	3	86	10.56
Sang-Wook Kim	4	792	152.77
Jee-Hee Yoon	5	49	8.70
