Title
Bounded Occurrence Edit Distance: A New Metric for String Similarity Joins with Edit Distance Constraints.
Abstract
Given two sets of strings and a similarity function on strings, similarity joins attempt to find all similar pairs of strings from each respective set. In this paper, we focus on similarity joins with respect to the edit distance, and propose a new metric called the bounded occurrence edit distance and a filter based on the metric. Using the filter, we can reduce the total time required to solve similarity joins because the metric can be computed faster than the edit distance by bitwise operations. We demonstrate the effectiveness of the filter through experiments.
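The filter-then-verify pattern described in the abstract can be illustrated with a minimal sketch. The abstract does not define the bounded occurrence edit distance, so the filter below is a stand-in and not the paper's metric: it uses a simple character-presence bitmask, which gives a valid lower bound on the edit distance because each edit operation can remove at most one distinct character from a string and introduce at most one. All function names (`char_mask`, `presence_lower_bound`, `similarity_join`) and the threshold parameter `tau` are hypothetical, chosen only for illustration.

```python
def char_mask(s):
    """Bitmask with bit ord(c) set for every character c occurring in s."""
    mask = 0
    for ch in s:
        mask |= 1 << ord(ch)
    return mask


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def presence_lower_bound(mx, my):
    """Lower bound on the edit distance from two presence masks.

    Stand-in filter (an assumption, NOT the paper's metric): every
    character present in only one string costs at least one edit.
    """
    only_x = popcount(mx & ~my)
    only_y = popcount(my & ~mx)
    return max(only_x, only_y)


def edit_distance(x, y):
    """Standard dynamic-programming (Wagner-Fischer) edit distance."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, 1):
        cur = [i]
        for j, cy in enumerate(y, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cx != cy)))   # substitution or match
        prev = cur
    return prev[-1]


def similarity_join(R, S, tau):
    """Return all (r, s, d) with r in R, s in S and d = ed(r, s) <= tau."""
    masks_r = [char_mask(r) for r in R]
    masks_s = [char_mask(s) for s in S]
    result = []
    for r, mr in zip(R, masks_r):
        for s, ms in zip(S, masks_s):
            if abs(len(r) - len(s)) > tau:
                continue  # length filter
            if presence_lower_bound(mr, ms) > tau:
                continue  # cheap bitwise filter
            d = edit_distance(r, s)  # expensive verification
            if d <= tau:
                result.append((r, s, d))
    return result


pairs = similarity_join(["kitten", "flaw"], ["sitting", "lawn", "xyz"], 2)
```

Because the filter never overestimates the edit distance, pruning with it cannot discard a true result; it only reduces how often the quadratic-time verification runs.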
Year
2014
DOI
10.1007/978-3-319-04298-5_32
Venue
Lecture Notes in Computer Science
Keywords
Edit distance, Similarity join problem, Similarity search, Data integration
Field
String-to-string correction problem, Edit distance, Discrete mathematics, Joins, Combinatorics, Computer science, Wagner–Fischer algorithm, Jaro–Winkler distance, Damerau–Levenshtein distance, String metric, Nearest neighbor search
DocType
Conference
Volume
8327
ISSN
0302-9743
Citations
0
PageRank
0.34
References
9
Authors
4
Name                Order  Citations  PageRank
Tomoki Komatsu      1      0          0.34
Ryosuke Okuta       2      0          0.34
Kazuyuki Narisawa   3      33         6.82
Ayumi Shinohara     4      936        88.28