Title
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Abstract
There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms, such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs), requires significant manual effort. We propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability for deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost-modeling method for rapid exploration of code optimizations. Experimental results show that TVM delivers performance across hardware back-ends that is competitive with state-of-the-art, hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs. We also demonstrate TVM's ability to target new accelerator back-ends, such as an FPGA-based generic deep learning accelerator. The system is open sourced and in production use inside several major companies.
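As a minimal sketch of the operator-level workflow the abstract describes, the snippet below uses TVM's tensor-expression (te) Python API roughly as it existed in the paper-era releases; the vector-add workload, the split factor of 64, and the "llvm" CPU target are illustrative assumptions for this record, not examples taken from the paper itself.

    import numpy as np
    import tvm
    from tvm import te

    # Declare the computation as a tensor expression: C[i] = A[i] + B[i].
    n = 1024
    A = te.placeholder((n,), name="A")  # dtype defaults to float32
    B = te.placeholder((n,), name="B")
    C = te.compute((n,), lambda i: A[i] + B[i], name="C")

    # Build a schedule: the same expression can target different hardware
    # by changing only these scheduling decisions.
    s = te.create_schedule(C.op)
    outer, inner = s[C].split(C.op.axis[0], factor=64)  # assumed tile size
    s[C].parallel(outer)
    s[C].vectorize(inner)

    # Compile for a CPU back-end; a GPU target (e.g., "cuda") would reuse
    # the same expression with a different schedule.
    fadd = tvm.build(s, [A, B, C], target="llvm")

    # Run the compiled operator and check the result.
    dev = tvm.cpu(0)
    a = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    b = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    c = tvm.nd.empty((n,), "float32", dev)
    fadd(a, b, c)
    np.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy(), rtol=1e-5)

In the full system, scheduling knobs like the split factor and vectorization above are exactly what the learning-based cost model explores automatically in place of hand tuning.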
Year: 2018
Venue: OSDI
Field: Computer architecture, End-to-end principle, Computer science, Field-programmable gate array, Compiler, Optimizing compiler, Operator (computer programming), Software portability, Artificial intelligence, Deep learning, CAS latency, Distributed computing
DocType: Conference
Citations: 11
PageRank: 0.59
References: 0
Authors: 12
Name                   Order  Citations  PageRank
Tianqi Chen            1      18878      3.63
Thierry Moreau         2      105        8.54
Ziheng Jiang           3      67         7.19
Lianmin Zheng          4      15         1.71
Eddie Q. Yan           5      44         3.53
Haichen Shen           6      163        8.06
Meghan Cowan           7      17         1.72
Leyuan Wang            8      28         2.74
Yuwei Hu               9      39         4.19
Luis Ceze              10     21831      25.93
Carlos Guestrin        11     92204      88.92
Arvind Krishnamurthy   12     45403      12.24