Title
On Analyzing Large Graphs Using GPUs
Abstract
Studying properties of graphs is essential to various applications, and recent growth of online social networks has spurred interests in analyzing their structures using Graphical Processing Units (GPUs). Utilizing the faster available shared memory on GPUs have provided tremendous speed-up for solving many general-purpose problems. However, when data required for processing is large and needs to be stored in the global memory instead of the shared memory, simultaneous memory accesses by threads in execution becomes the bottleneck for achieving higher throughput. In this paper, for storing large graphs, we propose and evaluate techniques to efficiently utilize the different levels of the memory hierarchy of GPUs, with the focus being on the larger global memory. Given a graph G = (V, E), we provide an algorithm to count the number of triangles in G, while storing the adjacency information on the global memory. Our computation techniques and data structure for retrieving the adjacency information is derived from processing the breadth-first-search tree of the input graph. Also, techniques to generate combinations of nodes for testing the properties of graphs induced by the same are discussed in detail. Our methods can be extended to solve other combinatorial counting problems on graphs, such as finding the number of connected sub graphs of size κ, number of cliques (resp. independent sets) of size κ, and related problems for large data sets. In the context of the triangle counting algorithm, we analyze and utilize primitives such as memory access coalescing and avoiding partition camping that offset the increase in access latency of using a slower but larger global memory. Our experimental results for the GPU implementation show at least 10 times speedup for triangle counting over the CPU counterpart. Another 6 -- 8% increase in performance is obtained by utilizing the above mentioned primitives as compared to the naïve implementation of the program on the GPU.
Year
DOI
Venue
2013
10.1109/IPDPSW.2013.235
Parallel and Distributed Processing Symposium Workshops & PhD Forum
Keywords
Field
DocType
large data set,memory access coalescing,adjacency information,shared memory,larger global memory,simultaneous memory access,large graphs,memory hierarchy,available shared memory,global memory,data structure,graph theory,data structures,testing,instruction sets
Uniform memory access,Shared memory,Computer science,Parallel computing,Distributed memory,Theoretical computer science,Out-of-core algorithm,Memory map,Flat memory model,Distributed shared memory,CUDA Pinned memory
Conference
ISBN
Citations 
PageRank 
978-0-7695-4979-8
6
0.53
References 
Authors
12
3
Name
Order
Citations
PageRank
Amlan Chatterjee1313.95
Sridhar Radhakrishnan234143.65
John K. Antonio3111.01