Title
High-Precision BLAS on FPGA-enhanced Computers
Abstract
The emergence of high-density reconfigurable hardware devices gives scientists and engineers an option for accelerating their numerical computing applications on low-cost but powerful "FPGA-enhanced computers". In this paper, we describe our efforts to improve the computational performance of Basic Linear Algebra Subprograms (BLAS) using FPGA-specific algorithms and methods. Our study focuses on three BLAS subroutines: floating-point summation, matrix-vector multiplication, and matrix-matrix multiplication. Together they represent all three levels of BLAS functionality, and their sustained computational performance is bound either by memory bandwidth or by computation. By proposing a group-alignment-based floating-point summation method and applying this technique to the other subroutines, we significantly improve their sustained computational performance and reduce numerical errors while consuming only moderate FPGA resources. Compared with existing FPGA-based implementations, our designs are efficient and compact, with improved numerical accuracy and stability.
Year
2007
Venue
ERSA
Keywords
reconfigurable hardware, basic linear algebra subprograms, matrix multiplication, memory bandwidth, floating point
Field
Memory bandwidth, Subroutine, Computer science, Floating point, Parallel computing, Field-programmable gate array, Computational science, Multiplication, Computation, Reconfigurable computing, Basic Linear Algebra Subprograms
DocType
Conference
Citations
2
PageRank
0.41
References
10
Authors
4
Name, Order, Citations, PageRank
Chuan He, 1, 55, 6.23
Guan Qin, 2, 96, 12.51
Richard E. Ewing, 3, 252, 45.87
Wei Zhao, 4, 3532, 404.01