RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication

Paper Info

Title
RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication

Abstract
Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is difficult, error-prone, and does not scale to the increasing heterogeneity of analysis workloads. Given these factors, we present a partial data replication system called RADAR. We capture datatype- and collective-aware I/O access patterns, which indicate logical access, via MPI-IO tracing, and we use a combination of coarse-grained and fine-grained performance modeling to evaluate and select optimized physical data distributions for the task at hand. Unlike conventional methods, we store all replica data and metadata, along with the original untouched data, under a single file container using the object abstraction in parallel filesystems. Our system yields manyfold improvements in some commonly used subvolume decomposition access patterns. Moreover, the modeling approach can determine whether such optimizations should be undertaken in the first place.
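
The abstract's point about logical layout can be illustrated with a minimal sketch. It only counts how many contiguous element runs a subvolume read touches under row-major ("C") versus column-major ("F") linearization. The function names `linear_offset` and `contiguous_runs` are hypothetical, and the sketch is not RADAR's performance model: it ignores file striping, collective I/O, and MPI-IO datatypes entirely.

```python
from itertools import product


def linear_offset(index, shape, order):
    """Map an N-d index to a linear element offset.

    order="C" is row-major (last dimension varies fastest),
    order="F" is column-major (first dimension varies fastest).
    """
    dims = range(len(shape)) if order == "C" else reversed(range(len(shape)))
    offset = 0
    for d in dims:
        offset = offset * shape[d] + index[d]
    return offset


def contiguous_runs(shape, start, size, order):
    """Count the contiguous element runs needed to read a subvolume.

    Brute force (hypothetical helper): enumerate every element of the
    subvolume, sort the linear offsets, and count breaks in the sequence.
    """
    ranges = [range(s, s + n) for s, n in zip(start, size)]
    offsets = sorted(linear_offset(idx, shape, order) for idx in product(*ranges))
    runs = 1
    for prev, cur in zip(offsets, offsets[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


# A 16x16 array, reading the first two columns (all 16 rows).
row_major_runs = contiguous_runs((16, 16), (0, 0), (16, 2), "C")
col_major_runs = contiguous_runs((16, 16), (0, 0), (16, 2), "F")
```

In this toy case the same logical request is split into many small runs under one layout and a single run under the other, which is the kind of asymmetry that makes the choice of a replica layout depend on the observed access pattern.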

Year	DOI	Venue
2014	10.1007/978-3-319-07518-1_19	ISC
Field	DocType	Citations
Replica, Metadata, Replication (computing), Supercomputer, Computer science, Parallel computing, Real-time computing, Abstraction (linguistics), Data access, Tracing, Scalability	Conference	10
PageRank	References	Authors
0.50	42	6

Authors (6 rows)

Name	Order	Citations	PageRank
John Jenkins	1	10	0.50
Xiaocheng Zou	2	64	5.90
Houjun Tang	3	53	15.97
Dries Kimpe	4	335	23.54
Robert Ross	5	2717	173.13
Nagiza F. Samatova	6	861	74.04
