Fast Schedule Tensor Computation on GPU with High Data Reuse and Device Utilization - Citegraph

Paper Info

Title
Fast Schedule Tensor Computation on GPU with High Data Reuse and Device Utilization

Abstract
Tensor computation, i.e., computation on high-dimensional arrays, is widely used in deep learning, image processing, and scientific computing, and the GPU has become the mainstream platform for accelerating it. We propose an algorithm that efficiently finds a promising schedule to exploit the parallelism and locality of computation on a GPU. In particular, we design an empirical model that jointly considers the locality, load balance, and parallelism sufficiency of a computation on a given GPU model to measure the quality of a candidate schedule. We also introduce empirical constraints that significantly reduce the schedule search space to polynomial complexity in terms of the computation dimensions. Compared with the state-of-the-art tool Tensor Comprehensions, our algorithm finds a promising schedule 5-45× faster, and the resulting scheduled code runs 1.5-10× faster.

Year	DOI	Venue
2019	10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00084	2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)
Keywords	DocType	ISBN
Tensor computation, scheduling, GPU	Conference	978-1-7281-4329-3
Citations	PageRank	References
0	0.34	0
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Yuxiang Zhang	1	11	15.58
Yu Zhang	2	109	20.13
