Title
Understanding Data Motion in the Modern HPC Data Center
Abstract
The utilization and performance of storage, compute, and network resources within HPC data centers have been studied extensively, but much less work has gone toward characterizing how these resources are used in conjunction to solve larger scientific challenges. To address this gap, we present our work in characterizing workloads and workflows at a data-center-wide level by examining all data transfers that occurred between storage, compute, and the external network at the National Energy Research Scientific Computing Center over a three-month period in 2019. Using a simple abstract representation of data transfers, we analyze over 100 million transfer logs from Darshan, HPSS user interfaces, and Globus to quantify the load on data paths between compute, storage, and the wide-area network based on transfer direction, user, transfer tool, source, destination, and time. We show that parallel I/O from user jobs, while undeniably important, is only one of several major I/O workloads that occur throughout the execution of scientific workflows. We also show that this approach can be used to connect anomalous data traffic to specific users and file access patterns, and we construct time-resolved user transfer traces to demonstrate that one can systematically identify coupled data motion for individual workflows.
Year
2019
DOI
10.1109/PDSW49588.2019.00012
Venue
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW)
Keywords
data movement, storage, workflows
DocType
Conference
ISBN
978-1-7281-6006-1
Citations
1
PageRank
0.36
References
14
Authors (5)
Name | Order | Citations | PageRank
Glenn K. Lockwood | 1 | 20 | 4.06
Shane Snyder | 2 | 64 | 8.38
Surendra Byna | 3 | 551 | 39.65
Philip H. Carns | 4 | 964 | 62.51
Nicholas J. Wright | 5 | 408 | 27.79