Title
Deconstructing Commodity Storage Clusters
Abstract
The traditional approach for characterizing complex systems is to run standard workloads and measure the resulting performance as seen by the end user. However, unique opportunities exist when characterizing a system that is itself constructed from standardized components: one can also look inside the system itself by instrumenting each of the components. In this paper, we show how intra-box instrumentation can help one understand the behavior of a large-scale storage cluster, the EMC Centera. In our analysis, we leverage standard tools for tracing both the disk and network traffic emanating from each node of the cluster. By correlating this traffic with the running workload, we are able to infer the structure of the software system (e.g., its write update protocol) as well as its policies (e.g., how it performs caching, replication, and load-balancing). Further, by imposing variable intra-box delays on network and disk traffic, we can confirm the causal relationships between network and disk events. Thus, we are able to infer the semantics of the messages between nodes without examining a single line of source code.
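The delay-injection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the function names, the trace layout (lists of timestamps in milliseconds), and the tolerance are all assumptions made for illustration. The sketch measures the gap from each network event to the next disk event in a baseline run and in a run with a known injected network delay. If the median gap grows by roughly the injected delay, the disk event plausibly depends on the network event.

```python
from bisect import bisect_left
from statistics import median


def gaps_to_next_disk_event(net_times, disk_times):
    """Gap (ms) from each network event to the first disk event at or after it."""
    disk_sorted = sorted(disk_times)
    gaps = []
    for t in net_times:
        i = bisect_left(disk_sorted, t)
        if i < len(disk_sorted):  # skip network events with no later disk event
            gaps.append(disk_sorted[i] - t)
    return gaps


def disk_follows_network(baseline, delayed, injected_delay, tolerance=5):
    """Return True if the median gap shifts by about `injected_delay` ms.

    `baseline` and `delayed` are hypothetical (net_times, disk_times) pairs
    taken from two runs of the same workload; both must yield at least one gap.
    """
    base_gap = median(gaps_to_next_disk_event(*baseline))
    delayed_gap = median(gaps_to_next_disk_event(*delayed))
    return abs((delayed_gap - base_gap) - injected_delay) <= tolerance


# Hypothetical traces: messages sent at 0, 100, and 200 ms.
baseline = ([0, 100, 200], [10, 110, 210])     # disk write 10 ms after each message
dependent = ([0, 100, 200], [60, 160, 260])    # disk writes shift with a 50 ms delay
independent = ([0, 100, 200], [10, 110, 210])  # disk writes do not move

print(disk_follows_network(baseline, dependent, injected_delay=50))
print(disk_follows_network(baseline, independent, injected_delay=50))
```

Using the median rather than the mean keeps a few unrelated disk events from dominating the estimate. A real analysis would also need to match events per node and per request, which this sketch omits.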
Year: 2005
DOI: 10.1109/ISCA.2005.20
Venue: ISCA
Keywords: load balancing, protocols, source code, electromagnetic compatibility, storage area networks, load balance, instruction sets, software system, software systems, servers, complex system
Field: Complex system, End user, Computer science, Source code, Workload, Parallel computing, Software system, Real-time computing, Storage area network, Tracing, Semantics
DocType: Conference
Volume: 33
Issue: 2
ISSN: 0163-5964
ISBN: 0-7695-2270-X
Citations: 22
PageRank: 2.84
References: 20
Authors: 5
Name                      Order  Citations  PageRank
Haryadi S. Gunawi         1      554        36.58
Nitin Agrawal             2      999        56.74
Andrea C. Arpaci-Dusseau  3      3133       307.84
Remzi H. Arpaci-Dusseau   4      3120       383.86
Jiri Schindler            5      411        26.82