Title
Implementing Wilson-Dirac operator on the cell broadband engine
Abstract
Computing the action of the Wilson-Dirac operator accounts for most of the CPU time in the grand challenge problem of simulating Lattice Quantum Chromodynamics (Lattice QCD). This routine is difficult to implement efficiently in most computational environments because the same data is accessed in multiple patterns, which makes it hard to align the data efficiently at compile time. Additionally, the low ratio of computation to memory access makes this computation bound by memory bandwidth and memory latency. In this work, we present an implementation of this routine on the Cell Broadband Engine. We propose runtime data fusion, an approach that re-aligns at runtime the data that cannot be aligned optimally at compile time, thus improving the performance of SIMDized execution. We also show a DMA optimization technique that reduces the impact of bandwidth limits on performance. Our implementation of this routine achieves 31.2 GFlops for single-precision computations and 8.75 GFlops for double-precision computations.
Year
2008
DOI
10.1145/1375527.1375532
Venue
I4CS
Keywords
memory latency, memory bandwidth, cell broadband engine, runtime data fusion, bandwidth limit, memory access ratio, cpu time, lattice qcd, re-aligning data, implementing wilson-dirac operator, routine exhibit, double precision computation, multi core, dirac operator, quantum chromodynamics, data fusion, computer architecture
DocType
Conference
Citations
10
PageRank
0.82
References
8
Authors
2
Name | Order | Citations | PageRank
Khaled Z. Ibrahim | 1 | 215 | 21.25
Francois Bodin | 2 | 54 | 4.85