GrayWulf: Scalable Clustered Architecture for Data Intensive Computing - Citegraph

Paper Info

Title
GrayWulf: Scalable Clustered Architecture for Data Intensive Computing

Abstract
Data-intensive computing presents a significant challenge for traditional supercomputing architectures that maximize FLOPS, since CPU speed has surpassed the I/O capabilities of HPC systems and BeoWulf clusters. We present the architecture of GrayWulf, a three-tier commodity-component cluster designed for a range of data-intensive computations operating on petascale data sets. The design goal is a system balanced in terms of I/O performance and memory size, according to Amdahl's Laws. The hardware currently installed at JHU exceeds one petabyte of storage and provides 0.5 bytes/sec of I/O and 1 byte of memory for each CPU cycle. The GrayWulf provides almost an order of magnitude better balance than existing systems. The paper covers its architecture and reference applications. The software design is presented in a companion paper.

Year	DOI	Venue
2009	10.1109/HICSS.2009.234	HICSS
Keywords	Field	DocType
data intensive computing,software design	Byte,Data-intensive computing,Supercomputer,Computer science,Petabyte,Amdahl's law,Parallel computing,Petascale computing,Instruction cycle,Operating system,Scalability	Conference
Citations	PageRank	References	Authors
27	1.93	10	12

Authors (12 rows)

Name	Order	Citations	PageRank
Alexander S. Szalay	1	959	105.36
Gordon Bell	2	27	2.27
Jan Vandenberg	3	255	32.25
Alainna Wonders	4	27	2.27
Randal Burns	5	1955	115.15
Dan Fay	6	72	6.84
Jim Heasley	7	27	1.93
Anthony J. G. Hey	8	287	40.29
María A. Nieto-Santisteban	9	98	11.03
Ani Thakar	10	330	48.74
Catharine Van Ingen	11	210	21.45
Richard Wilton	12	28	2.62
