High-Precision BLAS on FPGA-enhanced Computers - Citegraph

Paper Info

Title
High-Precision BLAS on FPGA-enhanced Computers

Abstract
The emergence of high-density reconfigurable hardware devices gives scientists and engineers the option of accelerating their numerical computing applications on low-cost but powerful "FPGA-enhanced computers". In this paper, we introduce our efforts to improve the computational performance of Basic Linear Algebra Subprograms (BLAS) using FPGA-specific algorithms and methods. Our study focuses on three BLAS subroutines: floating-point summation, matrix-vector multiplication, and matrix-matrix multiplication. They represent all three levels of BLAS functionality, and their sustained computational performance is either memory-bandwidth bound or computation bound. By proposing a group-alignment-based floating-point summation method and applying this technique to the other subroutines, we significantly improve their sustained computational performance and reduce numerical errors while consuming moderate FPGA resources. Compared with existing FPGA-based implementations, our designs are efficient and compact, with improved numerical accuracy and stability.

Year	2007
Venue	ERSA
Keywords	reconfigurable hardware, basic linear algebra subprograms, matrix multiplication, memory bandwidth, floating point
Field	Memory bandwidth, Subroutine, Computer science, Floating point, Parallel computing, Field-programmable gate array, Computational science, Multiplication, Computation, Reconfigurable computing, Basic Linear Algebra Subprograms
DocType	Conference
Citations	2
PageRank	0.41
References	10
Authors	4

Authors (4 rows)


Name	Order	Citations	PageRank
Chuan He	1	55	6.23
Guan Qin	2	96	12.51
Richard E. Ewing	3	252	45.87
Wei Zhao	4	3532	404.01
