Title
Distributed large-scale natural graph factorization
Abstract
Natural graphs, such as social networks, email graphs, or instant messaging patterns, have become pervasive through the internet. These graphs are massive, often containing hundreds of millions of nodes and billions of edges. While some theoretical models have been proposed to study such graphs, their analysis is still difficult due to the scale and nature of the data. We propose a framework for large-scale graph decomposition and inference. To resolve the scale, our framework is distributed so that the data are partitioned over a shared-nothing set of machines. We propose a novel factorization technique that relies on partitioning a graph so as to minimize the number of neighboring vertices rather than edges across partitions. Our decomposition is based on a streaming algorithm. It is network-aware as it adapts to the network topology of the underlying computational hardware. We use local copies of the variables and an efficient asynchronous communication protocol to synchronize the replicated values in order to perform most of the computation without having to incur the cost of network communication. On a graph of 200 million vertices and 10 billion edges, derived from an email communication network, our algorithm retains convergence properties while allowing for almost linear scalability in the number of computers.
Year
DOI
Venue
2013
10.1145/2488388.2488393
WWW
Keywords
Field
DocType
convergence property,social network,network communication,natural graph,efficient asynchronous communication protocol,billion edge,network topology,large-scale graph decomposition,email graph,large-scale natural graph factorization,email communication network,graph factorization,matrix factorization
Pseudoforest,Strength of a graph,Modular decomposition,Computer science,Implicit graph,Mixed graph,Theoretical computer science,Artificial intelligence,Distributed computing,Factor graph,World Wide Web,Graph factorization,Independent set,Machine learning
Conference
ISBN
Citations 
PageRank 
978-1-4503-2035-1
102
4.55
References 
Authors
20
5
Search Limit
100102
Name
Order
Citations
PageRank
Amr Ahmed1174392.13
Nino Shervashidze237413.90
Shravan M. Narayanamurthy340719.83
Vanja Josifovski42265148.84
Alexander J. Smola5196271967.09