Title
Optimizing Lossy Compression with Adjacent Snapshots for N-body Simulation Data
Abstract
Today's N-body simulations are producing extremely large amounts of data. The Hardware/Hybrid Accelerated Cosmology Code (HACC), for example, may simulate trillions of particles, producing tens of petabytes of data to store in a parallel file system, according to the HACC users. In this paper, we design and implement an efficient, in situ error-bounded lossy compressor to significantly reduce the data size for N-body simulations. Not only can our compressor save significant storage space for N-body simulation researchers, but it can also improve the I/O performance considerably with limited memory and computation overhead. Our contribution is threefold. (1) We propose an efficient data compression model by leveraging the consecutiveness of the cosmological data in both space and time dimensions as well as the physical correlation across different fields. (2) We propose a lightweight, efficient alignment mechanism to align the disordered particles across adjacent snapshots in the simulation, which is a fundamental step in the whole compression procedure. We also optimize the compression quality by exploring best-fit data prediction strategies and optimizing the frequencies of the space-based compression vs. time-based compression. (3) We evaluate our compressor using both a cosmological simulation package and molecular dynamics simulation data, two major categories in the N-body simulation domain. Experiments show that under the same distortion of data, our solution produces up to 43% higher compression ratios on the velocity field and up to 300% higher on the position field than do other state-of-the-art compressors (including SZ, ZFP, NUMARCK, and decimation). With our compressor, the overall I/O time on HACC data is reduced by up to 20% compared with the second-best compressor.
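The time-based idea in the abstract (predict each particle's value from the adjacent snapshot, then store only a bounded-error residual) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the assumption that every particle carries a unique ID usable for alignment, and the plain uniform quantizer are all hypothetical choices made for illustration; the paper's actual alignment mechanism and prediction strategies are not described in the abstract.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.

def align_by_id(prev_ids, prev_vals, curr_ids):
    """Reorder the previous snapshot's values to match the current particle order.

    Assumes (hypothetically) that each particle has a unique ID in both snapshots.
    """
    lookup = dict(zip(prev_ids, prev_vals))
    return [lookup[i] for i in curr_ids]


def compress_time_based(prev_recon, curr, error_bound):
    """Quantize residuals against the aligned previous (reconstructed) snapshot.

    Each residual is mapped to an integer code with bin width 2 * error_bound,
    so the reconstruction error stays within error_bound (up to float rounding).
    """
    step = 2.0 * error_bound
    return [int(round((c - p) / step)) for p, c in zip(prev_recon, curr)]


def decompress_time_based(prev_recon, codes, error_bound):
    """Rebuild the current snapshot from the previous one plus integer codes."""
    step = 2.0 * error_bound
    return [p + k * step for p, k in zip(prev_recon, codes)]


# Toy example: three particles whose storage order changed between snapshots.
prev_ids, prev_vals = [10, 11, 12], [0.0, 1.0, 2.0]
curr_ids, curr_vals = [12, 10, 11], [2.013, 0.004, 0.9871]
eb = 0.01

aligned = align_by_id(prev_ids, prev_vals, curr_ids)
codes = compress_time_based(aligned, curr_vals, eb)
recon = decompress_time_based(aligned, codes, eb)
max_err = max(abs(a - b) for a, b in zip(curr_vals, recon))
```

When particles move little between snapshots, the integer codes cluster near zero, which is what makes them cheap to encode with a subsequent entropy coder (not shown here).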
Year
2018
DOI
10.1109/BigData.2018.8622101
Venue
2018 IEEE International Conference on Big Data (Big Data)
Keywords
Error-bounded lossy compression, N-body simulation, large science data, I/O performance
Field
Data mining, Decimation, Lossy compression, Computer science, N-body simulation, Gas compressor, Computational science, Compression ratio, Data compression, Distortion, Computation
DocType
Conference
ISSN
2639-1589
ISBN
978-1-5386-5036-3
Citations
3
PageRank
0.38
References
0
Authors
5
Name
Order
Citations
PageRank
Sihuan Li1656.01
Sheng Di273755.88
Xin Liang310712.74
Zizhong Chen492469.93
Franck Cappello53775251.47