Title
DORY - Lightweight memory hierarchy management for deep NN inference on IoT endnodes - work-in-progress.
Abstract
IoT endnodes often couple a small and fast L1 scratchpad memory with a higher-capacity but slower, lower-bandwidth L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2X faster than when executed directly from L2 memory, while consuming 1.9X less energy.
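The tiling-plus-double-buffering scheme summarized in the abstract can be illustrated with a minimal C sketch. The dma_copy_async()/dma_wait() helpers and the kernel_on_tile() routine below are hypothetical stand-ins (a plain memcpy and a dummy reduction), not DORY's actual API; on a real IoT endnode they would wrap the platform's asynchronous DMA driver and an optimized DNN layer kernel. The point is only the schedule: while the core computes on the tile currently resident in one L1 buffer, the DMA fills the other buffer with the next tile from L2.

/* Minimal sketch: DMA-based double buffering of L2-resident data into two
 * L1 tile buffers. Hypothetical helpers, not DORY's actual implementation. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TILE_BYTES 256
#define N_TILES    8

/* Stand-ins for an asynchronous DMA driver (here: a synchronous memcpy). */
static void dma_copy_async(int8_t *dst, const int8_t *src, size_t n) { memcpy(dst, src, n); }
static void dma_wait(void) { /* would block until the pending transfer completes */ }

/* Placeholder compute kernel operating on one L1-resident tile. */
static int32_t kernel_on_tile(const int8_t *tile, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) acc += tile[i];
    return acc;
}

int main(void) {
    static int8_t l2_data[N_TILES * TILE_BYTES];   /* stands in for L2 background memory */
    static int8_t l1_buf[2][TILE_BYTES];           /* two tile buffers, stand in for L1  */
    for (size_t i = 0; i < sizeof l2_data; i++) l2_data[i] = (int8_t)i;

    int32_t checksum = 0;
    dma_copy_async(l1_buf[0], l2_data, TILE_BYTES);          /* prefetch tile 0 */
    for (size_t t = 0; t < N_TILES; t++) {
        size_t cur = t & 1;
        dma_wait();                                          /* tile t is now in L1 */
        if (t + 1 < N_TILES)                                 /* overlap next transfer */
            dma_copy_async(l1_buf[cur ^ 1],
                           l2_data + (t + 1) * TILE_BYTES, TILE_BYTES);
        checksum += kernel_on_tile(l1_buf[cur], TILE_BYTES); /* compute on current tile */
    }
    printf("checksum: %d\n", (int)checksum);
    return 0;
}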
Year
2019
DOI
10.1145/3349567.3351726
Venue
CODES+ISSS
Keywords
lightweight memory hierarchy management,deep NN inference,IoT endnodes,small L1 scratchpad memory,fast L1 scratchpad memory,coherent hardware cache hierarchy,labor-intensive explicit memory management,data memory footprint,deep neural network inference,lightweight software-cache,static data tiling,DMA-based double buffering,manual L1-L2 memory traffic management,DNN deployment oriented to memory,energy saving,large data memory footprint,CIFAR-10,storage capacity 142 kB
Field
Dory,Computer architecture,Memory hierarchy,Computer science,Work in process,Inference,Parallel computing,Internet of Things
DocType
Conference
ISBN
978-1-4503-6923-7
Citations
0
PageRank
0.34
References
0
Authors
5
Name                  Order  Citations  PageRank
Alessio Burrello      1      6          6.01
Francesco Conti 0001  2      125        18.24
Angelo Garofalo       3      4          2.47
Davide Rossi          4      416        47.47
Luca Benini           5      13116      1188.49