Abstract | ||
---|---|---|
As one of the most fundamental operations in graph analysis, subgraph matching is widely used in various fields such as social network analysis, knowledge graph query, and fraud detection. Due to its NP-complete complexity, sub-graph matching is challenging on large graphs. Previous work is limited on either scalability or the types of queries that can be handled. To address these problems, we propose a fast, scalable subgraph matching framework that consists of filtering, ordering, and enumeration stages. We exploit the parallelism in the filtering stage, and design a learning-based filtering method to remove false matching candidates; propose heuristic constraint and ordering generation methods to improve the matching efficiency; devise a distributed enumeration algorithm that is further optimized with the introduction of graph cache. Our learning- based filtering method delivers over 90% accuracy for basic queries. Compared with Prune.luice, our matching framework achieves 2–8 x speedup in triangle enumeration and up to 3–4 orders of magnitude higher throughput on generic query enumeration. The caching mechanism further boosts the performance by about 1.5 x to 2.5 x on average. Experiments also demonstrate the scalability of our framework.
<sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup>
<sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup>
This work is supported by the Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory (PDL). The grant number is WDZC2020SS00101.,
<sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup>
<sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup>
The source code is available at https://github.com/yixinchen200S/FAST. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/HPEC55821.2022.9926298 | 2022 IEEE High Performance Extreme Computing Conference (HPEC) |
Keywords | DocType | ISSN |
scalable subgraph matching framework,fundamental operations,graph analysis,social network analysis,knowledge graph query,NP-complete complexity,sub-graph matching,fast subgraph matching framework,enumeration stages,filtering stage,learning-based filtering,false matching candidates,heuristic constraint,ordering generation methods,matching efficiency,distributed enumeration algorithm,graph cache,filtering method,basic queries,triangle enumeration,generic query enumeration | Conference | 2377-6943 |
ISBN | Citations | PageRank |
978-1-6654-9787-9 | 0 | 0.34 |
References | Authors | |
14 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jiezhong He | 1 | 0 | 0.34 |
Zhouyang Liu | 2 | 0 | 0.34 |
Yixin Chen | 3 | 4326 | 299.19 |
Hengyue Pan | 4 | 0 | 0.34 |
Zhen Huang | 5 | 57 | 20.78 |
Dongsheng Li | 6 | 299 | 60.22 |