Title
Clustering nodes in large-scale biological networks using external memory algorithms
Abstract
Novel analytical techniques have dramatically enhanced our understanding of many application domains, including biological networks inferred from gene expression studies. However, there are clear computational challenges associated with the large datasets generated by these studies. The algorithmic solution of some NP-hard combinatorial optimization problems that naturally arise in the analysis of large networks is difficult without specialized computer facilities (i.e., supercomputers). In this work, we address the data clustering problem for large-scale biological networks with a polynomial-time algorithm that uses reasonable computing resources and is limited by the available memory. We have adapted and improved the MSTkNN graph partitioning algorithm and redesigned it to take advantage of external memory (EM) algorithms. We evaluate the scalability and performance of our proposed algorithm on a well-known breast cancer microarray study and its associated dataset.
Year: 2011
DOI: 10.1007/978-3-642-24669-2_36
Venue: ICA3PP (2)
Keywords: external memory, biological network, mstknn graph, large datasets, available memory, associated dataset, polynomial-time algorithm, clustering node, proposed algorithm, large-scale biological network, external memory algorithm, large network, external memory algorithms, data clustering
Field: Data mining, Biological network, Computer science, Parallel computing, Combinatorial optimization, Out-of-core algorithm, Cluster analysis, Graph partition, Memory architecture, Distributed computing, Scalability, Auxiliary memory
DocType: Conference
Volume: 7017
ISSN: 0302-9743
Citations: 4
PageRank: 0.48
References: 9
Authors: 5
Name
Order
Citations
PageRank
Ahmed Shamsul Arefin142.17
Mario Inostroza-Ponta24111.08
Luke Mathieson39211.92
Regina Berretta44911.60
Pablo Moscato533437.27