Title
ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA.
Abstract
Long Short-Term Memory (LSTM) is widely used in speech recognition. To achieve higher prediction accuracy, machine learning scientists have built increasingly large models. Such large models are both computation-intensive and memory-intensive. Deploying them results in high power consumption and leads to a high total cost of ownership (TCO) for a data center. To speed up prediction and make it energy-efficient, we first propose a load-balance-aware pruning method that can compress the LSTM model size by 20x (10x from pruning and 2x from quantization) with negligible loss of prediction accuracy. The pruned model is friendly to parallel processing. Next, we propose a scheduler that encodes and partitions the compressed model across multiple PEs for parallelism and schedules the complicated LSTM data flow. Finally, we design a hardware architecture, named Efficient Speech Recognition Engine (ESE), that works directly on the sparse LSTM model. Implemented on a Xilinx KU060 FPGA running at 200MHz, ESE has a performance of 282 GOPS working directly on the sparse LSTM network, corresponding to 2.52 TOPS on the dense one, and processes a full LSTM for speech recognition with a power dissipation of 41 Watts. Evaluated on the LSTM for speech recognition benchmark, ESE is 43x and 3x faster than Core i7 5930k CPU and Pascal Titan X GPU implementations, respectively. It achieves 40x and 11.5x higher energy efficiency compared with the CPU and GPU, respectively.
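The abstract names load-balance-aware pruning but does not spell out the procedure. The sketch below illustrates one way the idea could work, under assumptions that are not taken from the abstract: rows of a weight matrix are interleaved across PEs (row r goes to PE r % num_pes), and each PE's share is pruned by magnitude to the same number of surviving weights. The function names and the toy matrix are hypothetical.

```python
def load_balanced_prune(weights, num_pes, keep_ratio):
    """Magnitude-prune so every PE keeps the same number of weights.

    weights: list of rows (lists of floats).
    Assumption (illustrative only): row r is owned by PE r % num_pes.
    """
    pruned = [list(row) for row in weights]
    for pe in range(num_pes):
        # Gather (magnitude, row, col) for every weight owned by this PE.
        owned = [(abs(w), r, c)
                 for r in range(pe, len(weights), num_pes)
                 for c, w in enumerate(weights[r])]
        keep = int(len(owned) * keep_ratio)
        # Largest magnitudes first; everything after the first `keep` is zeroed.
        owned.sort(key=lambda t: -t[0])
        for _, r, c in owned[keep:]:
            pruned[r][c] = 0.0
    return pruned


def nonzeros_per_pe(weights, num_pes):
    """Count surviving (nonzero) weights per PE under the same row interleaving."""
    counts = [0] * num_pes
    for r, row in enumerate(weights):
        counts[r % num_pes] += sum(1 for w in row if w != 0.0)
    return counts


# Toy 4x4 matrix with no zero entries, split across 2 PEs, keeping 25%.
demo = [
    [0.90, -0.10, 0.20, 0.05],
    [0.01, 0.02, -0.03, 0.04],
    [0.80, 0.70, -0.60, 0.50],
    [0.06, -0.07, 0.08, 0.09],
]
demo_pruned = load_balanced_prune(demo, num_pes=2, keep_ratio=0.25)
print(nonzeros_per_pe(demo_pruned, num_pes=2))
```

In this toy matrix, a single global magnitude threshold would keep the four largest weights, all of which sit in rows 0 and 2, so one PE would hold all the work and the other none. Pruning per PE leaves each with two surviving weights, which is the kind of balance the abstract describes as friendly to parallel processing.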
Year: 2017
DOI: 10.1145/3020078.3021745
Venue: FPGA
Keywords: Deep Learning, Speech Recognition, Model Compression, Hardware Acceleration, Software-Hardware Co-Design, FPGA
Field: Computer science, Real-time computing, Artificial intelligence, Deep learning, Data flow diagram, Speedup, Central processing unit, Parallel computing, Field-programmable gate array, Speech recognition, Hardware acceleration, Quantization (signal processing), Hardware architecture
DocType: Conference
Citations: 115
PageRank: 3.75
References: 18
Authors: 12
Name              Order  Citations  PageRank
Song Han          1      2102       79.81
Junlong Kang      2      116        4.78
Huizi Mao         3      1279       41.30
Yiming Hu         4      639        44.91
Xin Li            5      530        60.02
Yubin Li          6      120        7.32
Dongliang Xie     7      251        21.85
Hong Luo          8      116        5.83
Song Yao          9      438        21.18
Yu Wang           10     2279       211.60
Huazhong Yang     11     2239       214.90
William J. Dally  12     11782      1460.14