Lossless filter for multiple repetitions with Hamming distance - Citegraph

Paper Info

Title
Lossless filter for multiple repetitions with Hamming distance

Abstract
Similarity search in texts, notably in biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been created in order to speed up the solution of the problem. However, previous filters were made for speeding up pattern matching, or for finding repetitions between two strings or occurring twice in the same string. In this paper, we present an algorithm called Nimbus for filtering strings prior to finding repetitions occurring twice or more in a string, or in two or more strings. Nimbus uses gapped seeds that are indexed with a new data structure, called a bi-factor array, that is also presented in this paper. Experimental results show that the filter can be very efficient: preprocessing with Nimbus a data set where one wants to find functional elements using a multiple local alignment tool such as Glam, the overall execution time can be reduced from 7.5 hours to 2 minutes.

Year	DOI	Venue
2008	10.1016/j.jda.2007.03.003	J. Discrete Algorithms
Keywords	Field	DocType
bi-factor array,multiple local alignment,approximate repetitions,hamming distance,functional element,indexing technique,bi-factors,gapped seed,multiple repetition,multiple local alignment tool,k-factors,numerous filtration,k -factors,lossless filter,overall execution time,new data structure,biological sequence,local alignment,indexation,pattern matching,data structure,similarity search	Data structure,Search engine indexing,Algorithm,Filter (signal processing),Hamming distance,Smith–Waterman algorithm,Pattern matching,Mathematics,Nearest neighbor search,Lossless compression	Journal
Volume	Issue	ISSN
6	3	Journal of Discrete Algorithms
Citations	PageRank	References
10	0.57	15
Authors
5

Authors (5 rows)

Cited by (10 rows)

References (15 rows)

Name	Order	Citations	PageRank
Pierre Peterlongo	1	390	20.28
Nadia Pisanti	2	302	30.91
Frédéric Boyer	3	92	8.64
Alair Pereira Do Lago	4	106	10.10
Marie-France Sagot	5	1337	109.23

1