Reliable Generation of High-Performance Matrix Algebra - Citegraph

Paper Info

Title
Reliable Generation of High-Performance Matrix Algebra

Abstract
Scientific programmers often turn to vendor-tuned Basic Linear Algebra Subprograms (BLAS) to obtain portable high performance. However, many numerical algorithms require several BLAS calls in sequence, and those successive calls result in suboptimal performance. The entire sequence needs to be optimized in concert. Instead of vendor-tuned BLAS, a programmer could start with source code in Fortran or C (e.g., based on the Netlib BLAS) and use a state-of-the-art optimizing compiler. However, our experiments show that optimizing compilers often attain only one-quarter the performance of hand-optimized code. In this paper we present a domain-specific compiler for matrix algebra, the Build to Order BLAS (BTO), that reliably achieves high performance using a scalable search algorithm for choosing the best combination of loop fusion, array contraction, and multithreading for data parallelism. The BTO compiler generates code that is between 16% slower and 39% faster than hand-optimized code.

Year	DOI	Venue
2012	10.1145/2629698	ACM Transactions on Mathematical Software
Keywords	Field	DocType
domain specific languages,linear algebra,genetic algorithms	Loop fusion,Source code,Netlib,Computer science,Parallel computing,Fortran,Optimizing compiler,Theoretical computer science,Compiler,Data parallelism,Basic Linear Algebra Subprograms	Journal
Volume	Issue	ISSN
abs/1205.1098	3	0098-3500
Citations	PageRank	References
0	0.34	19
Authors
5

Authors (5 rows)

Cited by (0 rows)

References (19 rows)

Name	Order	Citations	PageRank
Geoffrey Belter	1	9	1.54
Elizabeth R. Jessup	2	370	49.02
Jeremy G. Siek	3	563	45.96
Jeremy G. Siek	4	563	45.96
Boyana Norris	5	417	39.46

1