Making TifaMMy fit for tomorrow: Towards future shared memory systems and beyond - Citegraph

Paper Info

Title
Making TifaMMy fit for tomorrow: Towards future shared memory systems and beyond

Abstract
In this paper, we present the recent port of TifaMMy, our code implementing cache-oblivious algorithms for parallel LU decomposition, to two new architectures, SGI's UltraViolet distributed shared memory machine and Intel's latest x86 architecture, Sandy Bridge, together with the latest results on both. TifaMMy's matrix multiplication and LU decomposition routines have been further optimized for these architectures. Results for both the standard C++ version, compiled with vectorization switches, and TifaMMy's highly optimized vector intrinsics version are discussed and compared with Intel's architecture-specific, optimized numerical library, the Math Kernel Library (MKL).

Year	DOI	Venue
2011	10.1109/HPCSim.2011.5999869	High Performance Computing and Simulation
Keywords	Field	DocType
C++ language,cache storage,matrix decomposition,matrix multiplication,optimising compilers,parallel architectures,shared memory systems,C++ version,SGI UltraViolet,cache oblivious algorithm,distributed shared memory machine,matrix multiplication,optimized vector intrinsic version,parallel LU decomposition code TifaMMy,vectorization compiler switches,x86 architecture Sandy Bridge,block-recursive,cache-oblivious,linear algebra,parallelization,performance,shared memory platforms	Cache-oblivious algorithm,Shared memory,Computer science,Instruction set,Parallel computing,Vectorization (mathematics),Compiler,Distributed shared memory,Intrinsics,LU decomposition	Conference
ISBN	Citations	PageRank
978-1-61284-380-3	3	0.87
References	Authors
10	2

Authors (2 rows)

Name	Order	Citations	PageRank
Alexander Heinecke	1	344	32.67
Carsten Trinitis	2	151	29.80