FAST: A Scalable Subgraph Matching Framework over Large Graphs - Citegraph

Paper Info

Title
FAST: A Scalable Subgraph Matching Framework over Large Graphs

Abstract
As one of the most fundamental operations in graph analysis, subgraph matching is widely used in various fields such as social network analysis, knowledge graph query, and fraud detection. Due to its NP-complete complexity, sub-graph matching is challenging on large graphs. Previous work is limited on either scalability or the types of queries that can be handled. To address these problems, we propose a fast, scalable subgraph matching framework that consists of filtering, ordering, and enumeration stages. We exploit the parallelism in the filtering stage, and design a learning-based filtering method to remove false matching candidates; propose heuristic constraint and ordering generation methods to improve the matching efficiency; devise a distributed enumeration algorithm that is further optimized with the introduction of graph cache. Our learning- based filtering method delivers over 90% accuracy for basic queries. Compared with Prune.luice, our matching framework achieves 2–8 x speedup in triangle enumeration and up to 3–4 orders of magnitude higher throughput on generic query enumeration. The caching mechanism further boosts the performance by about 1.5 x to 2.5 x on average. Experiments also demonstrate the scalability of our framework. <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup> This work is supported by the Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory (PDL). The grant number is WDZC2020SS00101., <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> The source code is available at https://github.com/yixinchen200S/FAST.

Year	DOI	Venue
2022	10.1109/HPEC55821.2022.9926298	2022 IEEE High Performance Extreme Computing Conference (HPEC)
Keywords	DocType	ISSN
scalable subgraph matching framework,fundamental operations,graph analysis,social network analysis,knowledge graph query,NP-complete complexity,sub-graph matching,fast subgraph matching framework,enumeration stages,filtering stage,learning-based filtering,false matching candidates,heuristic constraint,ordering generation methods,matching efficiency,distributed enumeration algorithm,graph cache,filtering method,basic queries,triangle enumeration,generic query enumeration	Conference	2377-6943
ISBN	Citations	PageRank
978-1-6654-9787-9	0	0.34
References	Authors
14	6

Authors (6 rows)

Cited by (0 rows)

References (14 rows)

Name	Order	Citations	PageRank
Jiezhong He	1	0	0.34
Zhouyang Liu	2	0	0.34
Yixin Chen	3	4326	299.19
Hengyue Pan	4	0	0.34
Zhen Huang	5	57	20.78
Dongsheng Li	6	299	60.22

1