Scalable string similarity search/join with approximate seeds and multiple backtracking - Citegraph

Paper Info

Title
Scalable string similarity search/join with approximate seeds and multiple backtracking

Abstract
We present in this paper scalable algorithms for optimal string similarity search and join. Our methods are variations of those applied in Masai [15], our recently published tool for mapping high-throughput DNA sequencing data with unpreceded speed and accuracy. The key features of our approach are filtration with approximate seeds and methods for multiple backtracking. Approximate seeds, compared to exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds. Combined together, these two methods significantly speed up string similarity search and join operations. Our tool is implemented in C++ and OpenMP using the SeqAn library. The source code is distributed under the BSD license and can be freely downloaded from http://www.seqan.de/projects/edbt2013.

Year	DOI	Venue
2013	10.1145/2457317.2457386	EDBT/ICDT Workshops
Keywords	Field	DocType
bsd license,string similarity search,optimal string similarity search,approximate seed,exact seed,unpreceded speed,high-throughput dna,increase filtration specificity,multiple backtracking,seqan library,scalable string similarity search,computer science,filtration,radix tree,backtracking	Source code,Computer science,Parallel computing,Algorithm,Radix tree,Scalable algorithms,Suffix tree,Backtracking,String metric,Speedup,Scalability	Conference
Citations	PageRank	References
1	0.34	16
Authors
3

Authors (3 rows)

Cited by (1 rows)

References (16 rows)

Name	Order	Citations	PageRank
Enrico Siragusa	1	19	2.72
David Weese	2	252	17.79
Knut Reinert	3	1020	105.87

1