Abstract | ||
---|---|---|
As a new generation of parallel supercomputers enables researchers to conduct scientific simulations of unprecedented scale and resolution, terabyte-scale simulation output has become increasingly commonplace. Analysis of such massive data sets is typically I/O-bound: many parallel analysis programs spend most of their execution time reading data from disk rather than performing useful computation. To overcome this I/O bottleneck, we have developed a new data access method. Our main idea is to cache a copy of simulation output files on the local disks of an analysis cluster's compute nodes, and to use a novel task-assignment protocol to co-locate data access with computation. We have implemented our methodology in a parallel disk cache system called Zazen. By avoiding the overhead associated with querying metadata servers and by reading data in parallel from local disks, Zazen is able to deliver a sustained read bandwidth of over 20 gigabytes per second on a commodity Linux cluster with 100 nodes, approaching the optimal aggregated I/O bandwidth attainable on these nodes. Compared with conventional NFS, PVFS2, and Hadoop/HDFS, respectively, Zazen is 75, 18, and 6 times faster for accessing large (1-GB) files, and 25, 13, and 85 times faster for accessing small (2-MB) files. We have deployed Zazen in conjunction with Anton, a special-purpose supercomputer that dramatically accelerates molecular dynamics (MD) simulations, and have been able to accelerate the parallel analysis of terabyte-scale MD trajectories by about an order of magnitude. |
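
The co-location idea in the abstract (run each analysis task on a node that already holds the file on its local disk) can be illustrated with a minimal Python sketch. This is not Zazen's actual protocol, which the abstract does not spell out: the function name `assign_tasks`, the node and file names, and the greedy least-loaded rule are all assumptions made purely for illustration.

```python
from collections import defaultdict


def assign_tasks(file_locations, nodes):
    """Greedy, locality-aware assignment (illustrative assumption only).

    file_locations: dict mapping file name -> set of node names that hold
                    a cached copy on local disk (may be empty).
    nodes:          list of all node names in the analysis cluster.
    Returns a dict mapping node name -> list of files that node should read.
    """
    node_set = set(nodes)
    load = {node: 0 for node in nodes}
    assignment = defaultdict(list)

    # Cached files with the fewest copies go first (they have the fewest
    # placement choices); uncached files go last so they can be used to
    # even out the load across the cluster.
    def order(fname):
        copies = len(file_locations[fname] & node_set)
        return (copies == 0, copies, fname)

    for fname in sorted(file_locations, key=order):
        # Prefer nodes holding a local copy; otherwise fall back to any node.
        candidates = sorted(file_locations[fname] & node_set) or sorted(nodes)
        # Pick the least-loaded candidate, breaking ties by node name.
        target = min(candidates, key=lambda n: (load[n], n))
        assignment[target].append(fname)
        load[target] += 1

    return dict(assignment)


# Hypothetical example: three nodes, four files, one file cached nowhere.
example_locations = {
    "f1": {"n1"},
    "f2": {"n1", "n2"},
    "f3": {"n2"},
    "f4": set(),
}
print(assign_tasks(example_locations, ["n1", "n2", "n3"]))
# → {'n1': ['f1', 'f2'], 'n2': ['f3'], 'n3': ['f4']}
```

In this sketch every cached file is read by a node that holds it locally, and only the uncached file is sent to a node chosen purely by load. A real system would also need to handle node failures and decide where cache state is tracked, which this example leaves out.
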
Year | Venue | Keywords |
---|---|---|
2010 | FAST | scientific simulation data,parallel disk cache system,new data access method,local disk,analysis cluster,parallel supercomputers,parallel analysis program,data access,massive data set,execution time reading data,parallel analysis,molecular dynamic,linux cluster |
Field | DocType | Citations |
---|---|---|
Bottleneck,Disk buffer,Supercomputer,Cache,Computer science,Parallel computing,Server,Real-time computing,Bandwidth (signal processing),Data access,Computer cluster | Conference | 12 |
PageRank | References | Authors |
---|---|---|
0.78 | 31 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tiankai Tu | 1 | 193 | 14.17 |
Charles A. Rendleman | 2 | 41 | 2.81 |
Patrick J. Miller | 3 | 41 | 3.14 |
Federico D. Sacerdoti | 4 | 98 | 11.74 |
Ron O. Dror | 5 | 439 | 40.56 |
David Elliot Shaw | 6 | 890 | 139.33 |