Title
Optimized dense matrix multiplication on a many-core architecture
Abstract
Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), belong to a new set of many-core-on-a-chip systems with a software-managed memory hierarchy. New programming and compiling methodologies are required to fully exploit the potential of this new class of architectures. In this paper, we use dense matrix multiplication as a case study to present a general methodology for mapping applications to these kinds of architectures. Our methodology has the following characteristics: (1) Balanced distribution of work among threads to fully exploit available resources. (2) Optimal register tiling and sequence of traversing tiles, calculated analytically and parametrized according to the register file size of the processor used. This results in minimal memory transfers and optimal register usage. (3) Implementation of architecture-specific optimizations to further increase performance. Our experimental evaluation on a real C64 chip shows a performance of 44.12 GFLOPS, which corresponds to 55.2% of the peak performance of the chip. Additionally, measurements of power consumption show that the C64 is very power efficient, providing 530 MFLOPS/W for the problem under consideration.
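To make the register-tiling idea in point (2) concrete, the following is a minimal sketch in Python, not the authors' C64 implementation. It keeps a small accumulator tile in local variables (standing in for registers) and counts operand loads, so the reduction in memory transfers relative to a naive triple loop can be seen directly. The function names, the load counter, and the 2 x 2 default tile size are assumptions made for illustration; the paper derives its tile shape analytically from the register file size.

```python
def matmul_naive(a, b, n):
    """Plain triple loop; every multiply-add loads one element of A and one of B."""
    c = [[0.0] * n for _ in range(n)]
    loads = 0
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
                loads += 2
            c[i][j] = total
    return c, loads


def matmul_register_tiled(a, b, n, tile_rows=2, tile_cols=2):
    """Compute C in tile_rows x tile_cols tiles held in local accumulators.

    The tile sizes are hypothetical; they stand in for values that would be
    chosen from the register file size of the target processor.
    """
    c = [[0.0] * n for _ in range(n)]
    loads = 0
    for i0 in range(0, n, tile_rows):
        i1 = min(i0 + tile_rows, n)
        for j0 in range(0, n, tile_cols):
            j1 = min(j0 + tile_cols, n)
            # Accumulator tile: stays "in registers" for the whole k loop.
            acc = [[0.0] * (j1 - j0) for _ in range(i1 - i0)]
            for k in range(n):
                # Load one column slice of A and one row slice of B once,
                # then reuse each loaded value across the whole tile.
                a_col = [a[i][k] for i in range(i0, i1)]
                b_row = [b[k][j] for j in range(j0, j1)]
                loads += len(a_col) + len(b_row)
                for ii, a_val in enumerate(a_col):
                    for jj, b_val in enumerate(b_row):
                        acc[ii][jj] += a_val * b_val
            # Write the finished tile back to C once.
            for ii in range(i1 - i0):
                for jj in range(j1 - j0):
                    c[i0 + ii][j0 + jj] = acc[ii][jj]
    return c, loads
```

When the tile dimensions r and c divide n, this sketch performs n^3 (1/r + 1/c) operand loads, compared with 2 n^3 for the naive loop, which is why larger tiles (bounded by the number of registers) reduce memory traffic.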
Year: 2010
DOI: 10.1007/978-3-642-15291-7_29
Venue: Euro-Par (2)
Keywords: c64 chip, increase performance, optimal register usage, many-core architecture, register file size, optimal register tiling, new architecture, new class, optimized dense matrix multiplication, new set, new programming, peak performance, register file, matrix multiplication, chip, parallel systems, power efficiency
Field: Memory hierarchy, Computer science, Cache, Parallel computing, Register file, Thread (computing), Software, Multiplication, Matrix multiplication, Sparse matrix, Distributed computing
DocType: Conference
Volume: 6272
ISSN: 0302-9743
ISBN: 3-642-15290-2
Citations: 14
PageRank: 0.80
References: 12
Authors: 4
Name                Order  Citations  PageRank
Elkin Garcia        1      82         7.90
Ioannis E. Venetis  2      83         8.82
Rishi Khan          3      37         2.89
Guang R. Gao        4      2661       265.87