Title
An SSD-based eigensolver for spectral analysis on billion-node graphs.
Abstract
Many eigensolvers such as ARPACK and Anasazi have been developed to compute eigenvalues of a large sparse matrix. These eigensolvers are limited by the capacity of RAM. They run in memory of a single machine for smaller eigenvalue problems and require the distributed memory for larger problems. In contrast, we develop an SSD-based eigensolver framework called FlashEigen, which extends Anasazi eigensolvers to SSDs, to compute eigenvalues of a graph with hundreds of millions or even billions of vertices in a single machine. FlashEigen performs sparse matrix multiplication in a semi-external memory fashion, i.e., we keep the sparse matrix on SSDs and the dense matrix in memory. We store the entire vector subspace on SSDs and reduce I/O to improve performance through caching the most recent dense matrix. Our result shows that FlashEigen is able to achieve 40%-60% performance of its in-memory implementation and has performance comparable to the Anasazi eigensolvers on a machine with 48 CPU cores. Furthermore, it is capable of scaling to a graph with 3.4 billion vertices and 129 billion edges. It takes about four hours to compute eight eigenvalues of the billion-node graph using 120 GB memory.
Year
Venue
Field
2016
arXiv: Distributed, Parallel, and Cluster Computing
Graph,Vertex (geometry),Computer science,Parallel computing,Distributed memory,Linear subspace,Scaling,Multi-core processor,Sparse matrix,Eigenvalues and eigenvectors,Distributed computing
DocType
Volume
Citations 
Journal
abs/1602.01421
2
PageRank 
References 
Authors
0.36
12
5
Name
Order
Citations
PageRank
Da Zheng1958.61
Randal C. Burns284.19
Joshua T. Vogelstein327331.99
Carey E. Priebe4505108.56
Alexander S. Szalay5959105.36