Title
ZCOMP: Reducing DNN Cross-Layer Memory Footprint Using Vector Extensions
Abstract
Deep Neural Networks (DNNs) are becoming the prevalent approach in computer vision, machine learning, natural language processing, and speech recognition applications. Although DNNs are perceived as compute-intensive tasks, they also place intense pressure on the capacity and bandwidth of the memory hierarchy, primarily due to the large intermediate data communicated across network layers. Prior work on hardware DNN accelerators leverages cross-layer data sparsity via fully customized datapaths. However, dynamically compressing/expanding such data is a challenging task for general-purpose multi-processors with virtual memory and hardware-managed coherent cache hierarchies. In this paper, we observe that DNN intermediate data is either sequentially streamed or reshaped with a regular transformation between layers. Hence, accesses to this data can tolerate sequential or block-sequential compression/expansion without requiring random element retrieval. Based on this insight, we propose ZCOMP, a CPU vector ISA extension tailored for DNN cross-layer communication. ZCOMP compactly represents zero-value compression/expansion and fully automates metadata generation, storage, and retrieval, eliminating the need for several extra instruction executions and register usage. ZCOMP can be targeted for both inference and training to dynamically compress/expand cross-layer data before it is written to memory. Our evaluations of individual layers and end-to-end DNN networks demonstrate that ZCOMP offers substantial data traffic reduction, both on-chip across the cache hierarchy and off-chip to DRAM, as well as performance improvements over no compression and existing AVX512 compression approaches.
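For context, the software AVX-512 compression baseline the abstract refers to can be approximated with the vcompressps/vexpandps-style intrinsics: the non-zero mask has to be computed, the packed payload stored, and the mask kept as explicit per-vector metadata, which is exactly the bookkeeping the ZCOMP ISA extension automates. The sketch below is illustrative only and is not taken from the paper; the buffer layout and the names compress_block/expand_block are assumptions.

/* Illustrative sketch of a software AVX-512 zero-value compression baseline.
 * Not the ZCOMP extension itself: here the per-vector non-zero mask must be
 * generated, stored, and reloaded explicitly, i.e. the metadata handling that
 * ZCOMP automates. Assumed layout: one 16-bit mask per 16-float vector plus a
 * packed non-zero payload; n is assumed to be a multiple of 16 for brevity.
 * Compile with -mavx512f. */
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Compress: drop zero elements, return the number of floats written to 'packed'. */
static size_t compress_block(const float *src, size_t n,
                             float *packed, uint16_t *masks)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i += 16) {
        __m512 v = _mm512_loadu_ps(src + i);
        /* Per-lane "is non-zero" predicate becomes the metadata word. */
        __mmask16 m = _mm512_cmp_ps_mask(v, _mm512_setzero_ps(), _CMP_NEQ_OQ);
        masks[i / 16] = (uint16_t)m;
        /* vcompressps: store only the non-zero lanes, contiguously. */
        _mm512_mask_compressstoreu_ps(packed + out, m, v);
        out += (size_t)_mm_popcnt_u32(m);
    }
    return out;
}

/* Expand: re-insert zeros so the consumer layer sees the dense layout again. */
static void expand_block(const float *packed, const uint16_t *masks,
                         float *dst, size_t n)
{
    size_t in = 0;
    for (size_t i = 0; i < n; i += 16) {
        __mmask16 m = (__mmask16)masks[i / 16];
        /* vexpandps with zero-masking restores zeros in the empty lanes. */
        __m512 v = _mm512_maskz_expandloadu_ps(m, packed + in);
        _mm512_storeu_ps(dst + i, v);
        in += (size_t)_mm_popcnt_u32(m);
    }
}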
Year
2019
DOI
10.1145/3352460.3358305
Venue
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture
Keywords
CPU, Deep learning, ISA, compression, memory system, sparsity
Field
DRAM, Metadata, Memory hierarchy, Computer science, Inference, Virtual memory, Parallel computing, Bandwidth (signal processing), Artificial intelligence, Deep learning, Memory footprint
DocType
Conference
ISBN
978-1-4503-6938-1
Citations
5
PageRank
0.44
References
0
Authors
3
Name                 Order  Citations  PageRank
Berkin Akin          1      83         5.59
Zeshan Chishti       2      723        34.65
Alaa R. Alameldeen   3      1672       80.06