Title
The Evaluation of DCNN on Vector-SIMD DSP.
Abstract
Deep convolutional neural networks (DCNNs) achieve state-of-the-art results in current computer vision tasks and are now widely used across many application scenarios. In the age of artificial intelligence, it is also important to develop DCNNs on supercomputers. The Matrix2000, a general-purpose accelerator deployed in a supercomputer ranked fourth in the latest Top500 list, is a vector single instruction multiple data (vector-SIMD) digital signal processor (DSP). Research on DSP-based DCNN implementation is therefore important, yet few systematic studies have been published so far. Qualcomm, CEVA, and Cadence have stated that they have implemented DCNNs on their own DSPs. However, Qualcomm and CEVA have not published how they did so, and Cadence has implemented only a convolution layer on the Vision P6. In this paper, we propose a vectorization mapping method and a highly efficient partition analysis model for implementing DCNNs on a vector-SIMD DSP. Based on the mapping method and the analysis model, we implemented all layers of typical DCNN models on the Matrix2000 and measured computation efficiency and energy efficiency. The experiments show that the average computation efficiency of our implementation on the Matrix2000 is 20% to 35% higher than that of a GPU, 35% to 45% higher than that of a Xeon Phi, about 8% higher than that of the Vision P6 DSP, and about 62% to 75% higher than that of an existing evaluation of DCNN on the Matrix2000. The average energy efficiency is about 9% to 30% higher than that of the GPU and about 56% higher than that of the existing Matrix2000 evaluation. The results show that a vector-SIMD DSP with a suitable programming mapping method is also a suitable platform in the age of artificial intelligence.
Year
2019
DOI
10.1109/ACCESS.2019.2898711
Venue
IEEE ACCESS
Keywords
DCNN, DSP, vectorization mapping, partition, CuDNN, GPU
Field
Digital signal processing, Computer science, Parallel computing, SIMD, Distributed computing
DocType
Journal
Volume
7
ISSN
2169-3536
Citations
0
PageRank
0.34
References
0
Authors
4
Name
Order
Citations
PageRank
Chao Yang139939.13
Shuming Chen213838.21
Yaohua Wang34414.23
Junyang Zhang401.69