On Analyzing Large Graphs Using GPUs - Citegraph

Paper Info

Title
On Analyzing Large Graphs Using GPUs

Abstract
Studying properties of graphs is essential to various applications, and recent growth of online social networks has spurred interests in analyzing their structures using Graphical Processing Units (GPUs). Utilizing the faster available shared memory on GPUs have provided tremendous speed-up for solving many general-purpose problems. However, when data required for processing is large and needs to be stored in the global memory instead of the shared memory, simultaneous memory accesses by threads in execution becomes the bottleneck for achieving higher throughput. In this paper, for storing large graphs, we propose and evaluate techniques to efficiently utilize the different levels of the memory hierarchy of GPUs, with the focus being on the larger global memory. Given a graph G = (V, E), we provide an algorithm to count the number of triangles in G, while storing the adjacency information on the global memory. Our computation techniques and data structure for retrieving the adjacency information is derived from processing the breadth-first-search tree of the input graph. Also, techniques to generate combinations of nodes for testing the properties of graphs induced by the same are discussed in detail. Our methods can be extended to solve other combinatorial counting problems on graphs, such as finding the number of connected sub graphs of size κ, number of cliques (resp. independent sets) of size κ, and related problems for large data sets. In the context of the triangle counting algorithm, we analyze and utilize primitives such as memory access coalescing and avoiding partition camping that offset the increase in access latency of using a slower but larger global memory. Our experimental results for the GPU implementation show at least 10 times speedup for triangle counting over the CPU counterpart. Another 6 -- 8% increase in performance is obtained by utilizing the above mentioned primitives as compared to the naïve implementation of the program on the GPU.

Year	DOI	Venue
2013	10.1109/IPDPSW.2013.235	Parallel and Distributed Processing Symposium Workshops & PhD Forum
Keywords	Field	DocType
large data set,memory access coalescing,adjacency information,shared memory,larger global memory,simultaneous memory access,large graphs,memory hierarchy,available shared memory,global memory,data structure,graph theory,data structures,testing,instruction sets	Uniform memory access,Shared memory,Computer science,Parallel computing,Distributed memory,Theoretical computer science,Out-of-core algorithm,Memory map,Flat memory model,Distributed shared memory,CUDA Pinned memory	Conference
ISBN	Citations	PageRank
978-0-7695-4979-8	6	0.53
References	Authors
12	3

Authors (3 rows)

Cited by (6 rows)

References (12 rows)

Name	Order	Citations	PageRank
Amlan Chatterjee	1	31	3.95
Sridhar Radhakrishnan	2	341	43.65
John K. Antonio	3	11	1.01

1