Title
A profile-guided synergistic computation framework for Halide.
Abstract
Heterogeneous computing, which pairs the main processor(s) with accelerator(s) to boost application performance, has recently become popular. While bringing accelerators into play can improve performance, it can also sometimes produce negative results. This happens in particular during the execution of image processing applications, and Halide has this problem. Our previous study found that dynamically dispatching image processing tasks to the CPU and the GPU could often lead to prolonged execution time. In this paper, we propose a profile-guided job dispatching mechanism to better harness the computing power of the different types of computing elements. The proposed mechanism assigns computation tasks to the appropriate computing elements based on the performance measured during the early rounds of task execution. We implemented the proposed mechanism in the Halide framework. We evaluate the efficiency of the dispatching method with two benchmarks, bilateral grid filters and local Laplacian filters, using CPU-only, GPU-only, and hybrid CPU-GPU configurations. Our results show that, at 1K resolution, the profile-guided approach is 52% faster than the dynamic approach for local Laplacian filters, while for bilateral grid filters the difference is within 7%. At 8K resolution, it is 38% faster than the dynamic approach for local Laplacian filters, and for bilateral grid filters the difference is again within 7%. As a result, it delivers better results than the dispatching mechanism in previous work. Because high-level C++ objects are offered to programmers and the implementation details of the proposed method are hidden from them, programmers can focus on the application logic rather than on coordinating the computation between the heterogeneous computing elements.
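The dispatching idea described in the abstract (measure each computing element during the early rounds, then assign later rounds to the best one) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Halide API: the class name `ProfileGuidedDispatcher`, the `profile_rounds` parameter, the injectable `clock`, and the backend callables are all hypothetical names introduced here for illustration, and picking the lowest mean time is an assumed selection rule.

```python
import time
from typing import Any, Callable, Dict, List, Optional


class ProfileGuidedDispatcher:
    """Hypothetical sketch: time every backend for a few early rounds,
    then commit to the one with the lowest mean time."""

    def __init__(
        self,
        backends: Dict[str, Callable[[Any], Any]],
        profile_rounds: int = 2,
        clock: Callable[[], float] = time.perf_counter,
    ) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        if profile_rounds < 1:
            raise ValueError("profile_rounds must be >= 1")
        self.backends = backends              # e.g. {"cpu": ..., "gpu": ...}
        self.profile_rounds = profile_rounds  # assumed tuning knob
        self.clock = clock                    # injectable so tests can be deterministic
        self.samples: Dict[str, List[float]] = {name: [] for name in backends}
        self.chosen: Optional[str] = None

    def _next_to_profile(self) -> Optional[str]:
        # First backend that still lacks enough timing samples, if any.
        for name, times in self.samples.items():
            if len(times) < self.profile_rounds:
                return name
        return None

    def run(self, task: Any) -> Any:
        # After profiling, every round goes to the chosen backend.
        if self.chosen is not None:
            return self.backends[self.chosen](task)

        # Profiling phase: execute on the next backend and record the elapsed time.
        name = self._next_to_profile()
        start = self.clock()
        result = self.backends[name](task)
        self.samples[name].append(self.clock() - start)

        # Once every backend has been sampled, pick the lowest mean time.
        if self._next_to_profile() is None:
            self.chosen = min(
                self.samples,
                key=lambda n: sum(self.samples[n]) / len(self.samples[n]),
            )
        return result
```

In use, each backend would wrap one execution configuration (for instance CPU-only, GPU-only, or hybrid), and `run` would be called once per round of the image processing task. Note that the profiling rounds still produce real results, so no work is discarded while measuring.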
Year: 2017
DOI: 10.1016/j.sysarc.2017.10.005
Venue: Journal of Systems Architecture
Keywords: Image processing, Heterogeneous computing, Halide
Field: 8K resolution, Central processing unit, Computer science, Parallel computing, Image processing, Symmetric multiprocessor system, Real-time computing, Boosting (machine learning), Grid, Computation, Laplace operator
DocType: Journal
Volume: 81
ISSN: 1383-7621
Citations: 0
PageRank: 0.34
References: 13
Authors: 3
Name
Order
Citations
PageRank
Shih-wei Liao170392.73
Chia-Lung Kao200.34
Yuki Shimizu3116.91