Gfp-X: A Parallel Approach To Massive Graph Comparison Using Spark - Citegraph

Paper Info

Title
Gfp-X: A Parallel Approach To Massive Graph Comparison Using Spark

Abstract
The problem of how to compare empirical graphs is an area of great interest within the field of network science. The ability to accurately but efficiently compare graphs has a significant impact in such areas as temporal graph evolution, anomaly detection and protein comparison. The comparison problem is compounded when working with massive graphs containing millions of vertices and edges.This paper introduces a parallel feature extraction based approach for the efficient comparison of large unlabelled graph datasets using Apache Spark. The approach acts by producing a 'Graph Fingerprint' which represents both vertex level and global level topological features from a graph. By using Spark we are able to efficiently compare graphs considered unmanageably large to other approaches. The runtime of the approach is shown to scale sub-linearly with the size and complexity of the graphs being fingerprinted. Importantly, the approach is shown to not only be comparable to existing approaches, but on when comparing topology and size, more sensitive at detecting variation between graphs.

Year	Venue	Keywords
2016	2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)	Graph Similarity, Feature Extraction, Spark
Field	DocType	Citations
Network science,Data mining,Anomaly detection,Modular decomposition,Spark (mathematics),Computer science,Theoretical computer science,Artificial intelligence,Graph operations,Vertex (geometry),Null model,Clique-width,Machine learning	Conference	2
PageRank	References	Authors
0.36	17	5

Authors (5 rows)

Cited by (2 rows)

References (17 rows)

Name	Order	Citations	PageRank
Stephen Bonner	1	44	7.88
John Brennan	2	18	4.36
Georgios K. Theodoropoulos	3	189	20.78
Ibad Kureshi	4	24	5.22
Andrew Stephen Mcgough	5	105	11.18

1