Title
Characterizing Deep-Learning I/O Workloads in TensorFlow.
Abstract
The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. During training, a large number of relatively small files are first loaded and pre-processed on CPUs and then moved to accelerators for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google and specifically designed for solving DL problems. To measure TensorFlow I/O performance, we first design a micro-benchmark to measure TensorFlow reads, and then use a TensorFlow mini-application based on AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow. To improve the checkpointing performance, we design and implement a burst buffer. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3× and 7.8× on our benchmark environments. The use of the TensorFlow prefetcher results in a complete overlap of computation on the accelerator and the input pipeline on the CPU, eliminating the effective cost of I/O on the overall performance. The use of a burst buffer to checkpoint to fast, small-capacity storage and asynchronously copy the checkpoints to slower, large-capacity storage resulted in a performance improvement of 2.6× with respect to checkpointing directly to the slower storage on our benchmark environment.
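The burst-buffer idea described in the abstract (write the checkpoint to a fast tier, then drain it to a slower tier in the background) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library rather than TensorFlow, and the class name `BurstBufferCheckpointer`, its methods, and the file name are hypothetical, chosen only for illustration.

```python
import shutil
import tempfile
import threading
from pathlib import Path


class BurstBufferCheckpointer:
    """Hypothetical sketch: write to fast storage, copy to slow storage in the background."""

    def __init__(self, fast_dir, slow_dir):
        self.fast_dir = Path(fast_dir)
        self.slow_dir = Path(slow_dir)
        self.fast_dir.mkdir(parents=True, exist_ok=True)
        self.slow_dir.mkdir(parents=True, exist_ok=True)
        self._pending = []

    def save(self, name, payload):
        # Blocking part: only the write to the fast tier stalls the caller.
        fast_path = self.fast_dir / name
        fast_path.write_bytes(payload)
        # Non-blocking part: a background thread copies the file to the slow tier.
        worker = threading.Thread(
            target=shutil.copy2, args=(fast_path, self.slow_dir / name)
        )
        worker.start()
        self._pending.append(worker)
        return fast_path

    def wait(self):
        # Block until every background copy has finished.
        for worker in self._pending:
            worker.join()
        self._pending.clear()


if __name__ == "__main__":
    # Temporary directories stand in for the fast and slow storage tiers.
    with tempfile.TemporaryDirectory() as fast, tempfile.TemporaryDirectory() as slow:
        ckpt = BurstBufferCheckpointer(fast, slow)
        ckpt.save("step-100.ckpt", b"model weights placeholder")
        ckpt.wait()
        print(sorted(p.name for p in Path(slow).iterdir()))
```

In this sketch the training loop would only pay for the write to the fast tier; calling `wait()` before exit (or before reusing the fast tier's space) ensures the slow-tier copies are complete.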
Year
2018
DOI
10.1109/PDSW-DISCS.2018.00011
Venue
2018 IEEE/ACM 3rd International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS)
Keywords
Training, Pipelines, Checkpointing, Prefetching, Benchmark testing, Google
DocType
Conference
Volume
abs/1810.03035
Citations
8
PageRank
0.48
References
0
Authors
7
Name, Order, Citations, PageRank
Steven Wei Der Chien, 1, 35, 3.24
Stefano Markidis, 2, 207, 28.78
Chaitanya Prasad Sishtla, 3, 9, 0.85
Luís Santos, 4, 110, 14.58
Pawel Herman, 5, 8, 1.50
Sai Narasimhamurthy, 6, 10, 1.54
Erwin Laure, 7, 369, 44.71