Abstract | ||
---|---|---|
The graph similarity join retrieves all pairs of similar graphs from graph datasets. In this paper, we propose an efficient MapReduce-friendly algorithm for the graph similarity join problem on large-scale graph datasets. In particular, the efficiency of our algorithm is ensured by: 1) a scalable prefix-filtering scheme suited to q-gram alphabets too large to fit in memory; 2) an effective candidate reduction strategy that greatly cuts the data communication cost; 3) a two-round data access scheme that reduces the data access overhead. Extensive experiments on large-scale real and synthetic datasets demonstrate that our proposal outperforms the state-of-the-art method, with better scalability and higher speed. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1007/978-3-319-08010-9_43 | WEB-AGE INFORMATION MANAGEMENT, WAIM 2014 |
Field | DocType | Volume |
Edit distance,Reduction strategy,Data mining,Graph database,Graph similarity,Computer science,Filter (signal processing),Prefix,Theoretical computer science,Data access,Scalability | Conference | 8485 |
ISSN | Citations | PageRank |
0302-9743 | 0 | 0.34 |
References | Authors | |
9 | 5 |
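
The abstract names prefix filtering over q-grams as the first ingredient of the join. As a rough illustration of that general idea only, the sketch below applies the standard set-overlap prefix filter on a single machine; it is not the paper's MapReduce algorithm. The function names, the representation of each graph as a plain set of q-gram strings, and the `min_overlap` threshold are all assumptions made for this example. The principle is that if two sets share at least `min_overlap` tokens, then under one global token order their prefixes of length `len(set) - min_overlap + 1` must share at least one token, so only pairs with colliding prefixes need verification.

```python
from collections import Counter, defaultdict


def prefix_filter_candidates(records, min_overlap):
    """Return candidate pairs whose prefixes share at least one token.

    records: dict mapping a (hypothetical) graph id to its set of q-gram tokens.
    min_overlap: minimum number of shared tokens a similar pair must have.
    """
    # Global order: rare tokens first, ties broken by token value.
    freq = Counter(tok for toks in records.values() for tok in toks)
    index = defaultdict(list)  # token -> ids whose prefix contains it
    candidates = set()
    for rid in sorted(records):
        toks = sorted(records[rid], key=lambda t: (freq[t], t))
        prefix_len = len(toks) - min_overlap + 1
        if prefix_len <= 0:
            continue  # too few tokens to ever reach min_overlap
        for tok in toks[:prefix_len]:
            for other in index[tok]:
                candidates.add((other, rid))
            index[tok].append(rid)
    return candidates


def verify(records, candidates, min_overlap):
    """Keep only candidate pairs that really share min_overlap tokens."""
    return {
        (a, b)
        for a, b in candidates
        if len(records[a] & records[b]) >= min_overlap
    }


# Toy data: three "graphs" described by made-up q-gram tokens.
records = {
    "g1": {"a", "b", "c", "d"},
    "g2": {"b", "c", "d", "e"},
    "g3": {"w", "x", "y", "z"},
}
candidates = prefix_filter_candidates(records, min_overlap=3)
result = verify(records, candidates, min_overlap=3)
```

Ordering tokens by ascending frequency keeps the indexed prefixes made of rare q-grams, which tends to shrink the candidate set before the more expensive verification step.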