Title
Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory
Abstract
Scratchpad memory (SPM) is widely used in emerging domain-specific architectures and accelerators to improve energy efficiency and time predictability. SPM-based architectures typically use DMA to fetch data from off-chip memory and global load instructions to load fine-grained data directly into registers. For such architectures, neither capacity-only nor bandwidth-only loop tiling uses both the bandwidth and the SPM efficiently. This paper introduces a bandwidth-aware loop tiling approach that trades off SPM space utilization against bandwidth utilization by leveraging a runtime tiling framework and a cross-host-kernel interprocedural analysis (IPA). Experimental results demonstrate that our approach achieves a performance improvement of up to 4x, with a geometric mean of 26%.
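The tradeoff the abstract describes can be illustrated with a toy cost model. The sketch below is not the paper's algorithm. It assumes a hypothetical DMA engine whose transfer time is a fixed startup latency plus bytes divided by peak bandwidth, and a hypothetical double-buffered SPM layout. The function names and all parameters are illustrative assumptions. Capacity-only tiling would simply take the largest tile that fits; this sketch instead picks the smallest power-of-two tile that reaches a target fraction of peak bandwidth, and falls back to the capacity limit when the SPM is too small to reach it.

```python
def effective_bandwidth(tile_bytes, startup_s, peak_bytes_per_s):
    """Effective bandwidth of one DMA transfer under the assumed toy model:
    time = startup latency + bytes / peak bandwidth."""
    return tile_bytes / (startup_s + tile_bytes / peak_bytes_per_s)


def pick_tile_elems(spm_bytes, elem_bytes, startup_s, peak_bytes_per_s,
                    target_fraction=0.9, buffers=2):
    """Return a tile size in elements (hypothetical selection heuristic).

    Chooses the smallest power-of-two tile whose DMA transfer reaches
    target_fraction of peak bandwidth, capped by what fits in the SPM
    when `buffers` copies of the tile are resident (double buffering).
    """
    max_elems = spm_bytes // (elem_bytes * buffers)
    if max_elems < 1:
        raise ValueError("SPM too small for one element per buffer")
    tile = 1
    while tile * 2 <= max_elems:
        bw = effective_bandwidth(tile * elem_bytes, startup_s, peak_bytes_per_s)
        if bw >= target_fraction * peak_bytes_per_s:
            break  # bandwidth target met; leave the remaining SPM free
        tile *= 2
    return tile


if __name__ == "__main__":
    # Assumed numbers: 64 KiB SPM, 4-byte elements, 1 us startup, 1 GB/s peak.
    print(pick_tile_elems(64 * 1024, 4, 1e-6, 1e9))
    # Same machine with an 8 KiB SPM: the capacity limit decides.
    print(pick_tile_elems(8 * 1024, 4, 1e-6, 1e9))
```

With the larger SPM, the chosen tile is smaller than the capacity limit, which leaves scratchpad space for other data. With the smaller SPM, the capacity limit is reached before the bandwidth target.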
Year
2020
DOI
10.1145/3410463.3414637
Venue
PACT '20: International Conference on Parallel Architectures and Compilation Techniques, Virtual Event, GA, USA, October 2020
DocType
Conference
ISBN
978-1-4503-8075-1
Citations
1
PageRank
0.38
References
0
Authors
9
Name             Order  Citations  PageRank
Mingchuan Wu     1      1          0.38
Ying Liu         2      5          3.55
Huimin Cui       3      1          0.38
Qingfu Wei       4      1          0.38
Quanfeng Li      5      1          0.38
Limin Li         6      1          0.38
Fang Lv          7      16         3.65
Jingling Xue     8      1627       124.20
Xiaobing Feng    9      906        112.55